Goodchild, Michael F. (2011). Scale in GIS: An overview. Geomorphology 130 (2011), 5-9.

This review article discusses the relevance of scale, as a
disciplinary problem, within GIS and geomorphology. Three general problems of scale are
discussed, then two methods for conceptualizing phenomena, raster vs. vector
data, problems specific to GIS & modeling, & lastly, methods of
formalizing scale within various disciplines.

The first problem of scale is semantic. Scale is used in 3 senses in science:

1) Cartographers refer to the *representative fraction* as the parameter that defines the scaling of the earth to a sheet of paper

a. ratio between map/ground

b. defines level of detail, positional accuracy

c. this is undefined for digital data

2) Extent of study area

3) Resolution
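
The representative fraction in sense 1 is plain arithmetic, so a small sketch may help. The function names and the 1:24,000 example scale are assumptions chosen for illustration; they do not come from the article.

```python
def ground_distance_m(map_cm: float, scale_denominator: float) -> float:
    """Ground distance in metres for a distance measured on a 1:scale_denominator map."""
    # 1 cm on the map corresponds to scale_denominator cm on the ground.
    return map_cm * scale_denominator / 100.0


def map_distance_cm(ground_m: float, scale_denominator: float) -> float:
    """Map distance in centimetres for a ground distance in metres."""
    return ground_m * 100.0 / scale_denominator


# Illustrative scale only: 2 cm on a 1:24,000 sheet.
print(ground_distance_m(2.0, 24_000))  # 480.0
```

The same ratio has no obvious meaning for a digital data set, which has no fixed sheet size; that is the point of item c above.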

The second problem (resolution is finite):

1) The earth’s surface is infinitely complex &
could be mapped to molecules/smaller

2) Praxis determines the largest (usually most relevant) features

a. **This makes me wonder why the largest features are usually most relevant**

The third problem (resolution in processes):

a) If the process is significantly influenced by
detail smaller than the spatial resolution of the data, then the results of
analysis/modeling will be misleading (this is the problem of downscaling which
geostatistics seeks to alleviate through the use of an inferred correlogram)

a. **This leads me to wonder if there are any processes that require all sorts of different levels of scale**

b) Most theories are scale-free

a. can’t tell whether the model’s error (all models are imperfect) is due to spatial-resolution effects or the model, or both

There are two methods for conceptualizing geomorphological
phenomena, which are scale-independent theoretical frames:

1) Discrete objects

a. the world’s surface is empty like a table top (**it is empty in the sense of being simply flat, not *empty* empty or without a coordinate system**) except for discrete *things*

b. things can overlap

c. good for biological organisms, vehicles, buildings

d. represented as points, lines, areas, volumes

e. **so these would be better able to map interactions between things (even if those things are not *things* per se), where in the continuous-field this would be lost/aggregated/generalized, esp. raster models?**

f. **maybe we can make a partial analogy between this and substance ontology?**

2) Continuous-field

a. phenomena expressed as mappings from location to value/class, so every location in space-time has exactly one value of each variable

b. topography, soil type, air temperature, land cover class, soil moisture

c. **process ontology?**
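
The difference between the two conceptualizations can be sketched as two query styles. Everything named here (`BoxObject`, `objects_at`, `elevation`, the made-up surface formula) is hypothetical and only meant to show that a location can hold zero or several discrete objects, while a field returns exactly one value everywhere.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoxObject:
    """A discrete object with a rectangular footprint (hypothetical example type)."""
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, x: float, y: float) -> bool:
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


def objects_at(objects: List[BoxObject], x: float, y: float) -> List[str]:
    """Discrete-object view: a location may hold zero, one, or several objects."""
    return [o.name for o in objects if o.contains(x, y)]


def elevation(x: float, y: float) -> float:
    """Continuous-field view: every location maps to exactly one value (made-up surface)."""
    return 100.0 + 2.0 * x - 0.5 * y


objects = [BoxObject("shed", 0, 0, 2, 2), BoxObject("canopy", 1, 1, 3, 3)]
print(objects_at(objects, 1.5, 1.5))  # ['shed', 'canopy'] (objects overlap)
print(objects_at(objects, 9.0, 9.0))  # [] (empty "table top")
print(elevation(1.5, 1.5))            # 102.25 (always exactly one value)
```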

Both of these can be raster or vector data, but raster can’t
be laid on curved surfaces

1) Raster data has resolution explicit in the size of its cells

a. smaller cells, more of them = better resolution

b. the intervals between samples define this

c. in 3-d sampling the vertical dimension is often sampled differently than the horizontals, leading to a differential in resolution

i. e.g. remotely sensed 2-d map + field photos of vertical dimension

2) Vector resolution is poorly defined and difficult/impossible to infer a posteriori from the contents of its data sets (as most GIS work is done on already present data sets)

a. if the data’s taken at irregular points the distance between points may be used as resolution – **how often is the data taken irregularly?**

i. use the minimum, mean, or max nearest-neighbor distance?

b. when data’s captured as attributes of areas/volumes, it’s represented as polygons/polyhedra

i. resolution is tied to the infinitely thin boundaries, the density of sampling of the boundary, the within-area/volume variation that’s replaced by homogeneity, and the size of areas/volumes

ii. this creates the impression that vector data has infinite resolution

1. at finer resolutions the numbers of areas/volumes increase and their boundaries are given more detail but they’re more homogeneous
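
The nearest-neighbor question raised for irregularly sampled points can be made concrete. This is a minimal brute-force sketch with made-up coordinates; the function names are hypothetical, and it does not settle which of the three summaries best stands in for resolution.

```python
import math


def nearest_neighbor_distances(points):
    """Distance from each point to its closest other point (brute force, O(n^2))."""
    result = []
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        result.append(nearest)
    return result


def resolution_candidates(points):
    """Minimum, mean, and maximum nearest-neighbor distance of a point set."""
    d = nearest_neighbor_distances(points)
    return {"min": min(d), "mean": sum(d) / len(d), "max": max(d)}


# Made-up sample locations: three clustered points and one isolated point.
samples = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
summary = resolution_candidates(samples)
print(summary)
```

With one isolated sample the three summaries disagree widely (here the minimum is 1 and the maximum is about 5.66), which illustrates why resolution is hard to infer after the fact from irregular point data.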

GIS encounters a problem that applies to many types of geographic
data

-measuring the length of a digitized line

-a vector polyline’s length is easily computed as the sum of its straight-line segments

-even if we assume each sampled point lies exactly on the true line, we will get an underestimate of the true length, because the true line is never exactly straight between the sampled points

-the underestimate is by an amount that depends on sampling density

-this applies to all natural features, slope, land cover

-**this reminds me of stats (linear regression) and calc (instantaneous velocity)**

-**the issue of zooming out and losing detail/generalizing reminds me of “essentialism” from philosophy, & I’m sure the problem of definition comes up elsewhere in GIS (what variables to use, what sampling method)**

-the modifiable areal unit problem (MAUP) shows us that aggregation or intersections of mapping elements (data collection and state lines, for example) will change correlations in a manner that is not random variation from alternative samples

-Mandelbrot tells us that the rate of info loss/gain with changing scale is often predictable through scaling laws, exhibiting power-law behavior and self-similarity
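
The underestimate in the length of a digitized line, and its dependence on sampling density, can be checked numerically. The sketch below is an illustration under a stated assumption: a unit circle stands in for the "true line" (this is not an example from the article), and every sampled point lies exactly on it.

```python
import math


def polyline_length(points):
    """Sum of straight-line segment lengths between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def sample_circle(n, radius=1.0):
    """n points lying exactly on a circle, first point repeated to close the line."""
    pts = [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]
    return pts + [pts[0]]


true_length = 2 * math.pi  # circumference of the unit circle
lengths = {n: polyline_length(sample_circle(n)) for n in (4, 16, 64, 256)}
for n, length in lengths.items():
    # The shortfall shrinks as sampling gets denser but never reaches zero.
    print(n, round(length, 4), round(true_length - length, 4))
```

For a smooth curve like this one the measured length converges on the true value as sampling gets denser. For self-similar lines such as coastlines it keeps growing instead, which is where the scaling laws attributed to Mandelbrot come in.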

[image displaying self-similarity]

-geostatistics gives us spatial interpolation to refine the spatial resolution of a point data set artificially, using measures of spatial autocorrelation such as correlograms and variograms

**when creating an inferred correlogram does one infer up in levels/“steps” till the desired resolution of the study?**
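
To make the variogram idea less abstract, here is a minimal sketch of the classical empirical semivariogram, gamma(h) = sum((z[i+h] - z[i])^2) / (2 N(h)), on a regularly spaced 1-D transect. The transect values are made up, the function name is hypothetical, and this is a textbook estimator rather than a procedure described in the article.

```python
def semivariogram(values, max_lag):
    """Empirical semivariance for integer lags 1..max_lag on a regular 1-D transect."""
    gamma = {}
    for lag in range(1, max_lag + 1):
        # Squared differences between all pairs of samples separated by `lag` steps.
        sq_diffs = [(values[i + lag] - values[i]) ** 2 for i in range(len(values) - lag)]
        gamma[lag] = sum(sq_diffs) / (2 * len(sq_diffs))
    return gamma


# Made-up transect with a steady trend: semivariance grows with lag.
transect = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(semivariogram(transect, 3))  # {1: 0.5, 2: 2.0, 3: 4.5}
```

A fitted model of how semivariance changes with lag is what interpolation methods then use to estimate values between samples.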

-wavelet analysis (related to Fourier analysis) allows the decomposed field variables to vary spatially in a hierarchical fashion – **this sounds very interesting**
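
The hierarchical character of a wavelet decomposition can be seen in a tiny sketch. The Haar wavelet (unnormalized pairwise averages and differences) is assumed here only because it is the simplest case; the notes do not name a specific wavelet, and the signal values and function names are made up.

```python
def haar_step(signal):
    """One level: pairwise averages (coarser signal) and pairwise half-differences (detail)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_decompose(signal):
    """Full decomposition; assumes the signal length is a power of two."""
    levels = []
    current = list(signal)
    while len(current) > 1:
        current, details = haar_step(current)
        levels.append(details)  # finest level first, coarsest last
    return current[0], levels


overall_mean, detail_levels = haar_decompose([4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0])
print(overall_mean)   # 6.25
print(detail_levels)  # [[-1.0, -1.0, 0.0, -1.0], [-3.0, 3.5], [1.75]]
```

Each detail list describes variation at one scale and at specific positions along the signal, whereas a Fourier coefficient describes one frequency over the whole signal.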

**In light of all of this, I am currently
wondering about the study of part-whole interactions in relation to resolution
and vector/raster datasets.**