Published March 31, 2020 | Version v1
Dataset
Open
Strategies to access web-enabled urban spatial data for socioeconomic research using R functions [Code]
- 1. Universidad Católica del Norte, Chile
- 2. Universidad Autónoma de Madrid, Spain
- 3. Universidad Católica de Ávila, Spain
Description
Code accompanying the publication "Strategies to access web-enabled urban spatial data for socioeconomic research using R functions". Since the introduction of the World Wide Web in the 1990s, the information available for research purposes has increased exponentially, leading to a significant proliferation of web-based research. Nowadays the use of internet-based databases is common, with data obtained either from primary sources (online surveys) or from secondary official and non-official registers. However, data availability varies by data category and country; in particular, collecting microdata at a low geographical level for urban analysis can be a challenge. The most common difficulties when working with secondary web-based data can be grouped into two categories: accessibility and availability problems. Accessibility problems arise when the way the data are published on the servers blocks or delays the download process, turning it into a tedious, repetitive task that can introduce errors when building large databases. Availability problems usually arise when official agencies restrict access to the information for statistical confidentiality reasons. To overcome some of these problems, this paper presents different strategies based on URL parsing, PDF text extraction and web scraping. A set of functions, available under a GPL-2 license, has been built into an R package specifically to extract and organize databases at the municipality level (NUTS 5) in Spain on population, unemployment, vehicle fleet and firms.
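As an illustration of the URL-parsing strategy mentioned above, the following is a minimal sketch in base R, not the code distributed in this record. The base URL, the file-naming pattern and the function names `build_urls` and `safe_download` are all hypothetical; the actual endpoints and functions used by the authors are not described here. The idea is to generate every download address from a template and to wrap each download so that one blocked or delayed request does not abort the whole batch.

```r
# Build one download URL per province/year combination.
# The URL pattern is an assumption made for illustration only.
build_urls <- function(base_url, provinces, years) {
  grid <- expand.grid(province = as.integer(provinces),
                      year = as.integer(years),
                      stringsAsFactors = FALSE)
  # Zero-pad province codes to two digits, e.g. 6 -> "06"
  grid$url <- sprintf("%s/%d/province_%02d.csv",
                      base_url, grid$year, grid$province)
  grid
}

# Download a single file; return TRUE on success and FALSE on failure
# instead of stopping, so a loop over many URLs can continue.
safe_download <- function(url, destfile) {
  tryCatch({
    utils::download.file(url, destfile, quiet = TRUE, mode = "wb")
    TRUE
  }, error = function(e) FALSE, warning = function(w) FALSE)
}

# Hypothetical usage: two province codes over two years
urls <- build_urls("https://example.org/data",
                   provinces = c(6, 10), years = 2018:2019)
print(urls$url)
```

Failed downloads can then be collected and retried separately, which avoids repeating the whole process by hand when a server is slow or temporarily unavailable.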
Files
(17.5 kB)
| Name | Size |
|---|---|
| Checksum: md5:d7bcd6e79ca8830ec0ab05f366891dd6; PID: http://hdl.handle.net/11304/d875107a-2fd9-4345-b480-fb2ca7347885 | 1.1 kB |
| Checksum: md5:9e6b0038bf611d076455e5e05c2f2a65; PID: http://hdl.handle.net/11304/bc85e61a-3f33-457b-acb7-cd6e266f08f2 | 1.4 kB |
| Checksum: md5:7e50987a856f020d7f4ad09f50a7a2a3; PID: http://hdl.handle.net/11304/19ff1766-71d6-4909-9dd8-02754beac2b7 | 1.2 kB |
| Checksum: md5:a4e73274be5fbd84a24c398ef0d644b7; PID: http://hdl.handle.net/11304/996e8419-2e78-44ec-97b0-ad4a6ff8acd4 | 13.8 kB |
Additional details
Identifiers
- b2rec
- e6335452f30a456d8eb9e8065a29955e
Funding
- Spanish Ministry of Economics and Competitiveness, grant number ECO2015-65758-P
- Regional Government of Extremadura (Spain).