Download the shapefile from https://catalog.data.gov/dataset/tiger-line-shapefile-2019-2010-nation-u-s-2010-census-5-digit-zip-code-tabulation-area-zcta5-na
Simplifying using GDAL
We can use the ogr2ogr command from the GDAL library to convert the shapefile to GeoJSON, but even with only one field and coordinates limited to six decimal places, the output file is over 1 GB.
ogr2ogr -f GeoJSON -select ZCTA5CE10 -lco COORDINATE_PRECISION=6 zcta.geojson /vsizip/tl_2017_us_zcta510.zip
I tried to convert this to TopoJSON, but the topojson library runs out of memory on a file this size, even on a very powerful 2017 MacBook Pro.
npx topojson -q 1e4 -o zcta_topo.json zcta.geojson
>> JavaScript heap out of memory
Another method I tried was the -simplify option in ogr2ogr. The simplify argument is a distance tolerance expressed in the units of the shapefile's spatial reference system. Since the ZCTA shapefile uses a geographic coordinate system (NAD83, which is nearly identical to WGS84 at this scale), the unit is decimal degrees.
ogr2ogr -f "GeoJSON" -lco COORDINATE_PRECISION=6 -select ZCTA5CE10 -simplify 0.006 zcta.geojson /vsizip/tl_2017_us_zcta510.zip
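To get a feel for what a tolerance of 0.006 means on the ground, here is a rough conversion sketch. It assumes a spherical Earth with about 111,320 m per degree of latitude, and the helper name deg_to_m is hypothetical, not part of GDAL.

```shell
# Rough conversion of a tolerance in degrees to meters (spherical approximation).
# Assumption: 111320 m per degree of latitude; east-west distances shrink by cos(latitude).
deg_to_m() {
  # $1 = tolerance in degrees, $2 = latitude in degrees
  awk -v d="$1" -v lat="$2" 'BEGIN {
    pi = atan2(0, -1)
    ns = d * 111320                  # north-south extent in meters
    ew = ns * cos(lat * pi / 180)    # east-west extent in meters at this latitude
    printf "%.0f %.0f\n", ns, ew
  }'
}

# Tolerance used above, evaluated at 40 degrees north
deg_to_m 0.006 40
```

At 40 degrees north this works out to roughly 668 m north-south and 512 m east-west, so the same tolerance removes more detail, relative to ground distance, in the south than in the north.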
This creates a much smaller GeoJSON file (30 MB), which the topojson library can easily handle, and we end up with a more manageable (but still too large) 13 MB TopoJSON file. However, the topology of the result is very poor at medium to large scales: each polygon is simplified independently, so shared boundaries between neighboring ZCTAs no longer line up.
npx topojson -q 1e5 -o zcta_topo.json zcta.geojson
Simplifying using PostGIS
Create a Docker volume to use for persistence
docker volume create postgresql
Run the PostGIS Docker container
docker run --name postgis -p 25432:5432 -it --mount source=postgresql,target=/var/lib/postgresql kartoza/postgis
Load the ZCTA shapefile into PostGIS
ogr2ogr -f "PostgreSQL" -progress -select "ZCTA5CE10" -overwrite -lco OVERWRITE=yes -nln zcta -nlt PROMOTE_TO_MULTI -t_srs "EPSG:4326" PG:"dbname='gis' host='localhost' port='25432' user='docker' password='docker'" ~/Downloads/tl_2017_us_zcta510/tl_2017_us_zcta510.shp
Sample query with st_simplifypreservetopology (ZCTAs beginning with 0 or 1, which covers New England and nearby states). This takes a long time to run for the entire country, and we still lose a lot of the topology.
select st_simplifypreservetopology(wkb_geometry, 0.025) as thegeom, zcta5ce10 from zcta where zcta5ce10 like '0%' OR zcta5ce10 like '1%'
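Since the sample query already filters on the leading digit of the ZCTA code, one way to cope with the long run time is to run the same query one digit at a time. The following is only a sketch: the helper name zcta_simplify_sql and the output file names are hypothetical, and the connection string is the one from the load step. Because st_simplifypreservetopology works on each geometry separately, chunking changes only how the work is split up, not the topology loss.

```shell
# Print the simplification query for one leading ZCTA digit (hypothetical helper).
zcta_simplify_sql() {
  # $1 = leading digit of the ZCTA code, $2 = tolerance in degrees
  printf "select st_simplifypreservetopology(wkb_geometry, %s) as thegeom, zcta5ce10 from zcta where zcta5ce10 like '%s%%'\n" "$2" "$1"
}

# Show the query for ZCTAs starting with 0
zcta_simplify_sql 0 0.025

# To export each chunk with ogr2ogr (needs the PostGIS container running):
# for d in 0 1 2 3 4 5 6 7 8 9; do
#   ogr2ogr -f GeoJSON "zcta_$d.geojson" PG:"dbname='gis' host='localhost' port='25432' user='docker' password='docker'" -sql "$(zcta_simplify_sql "$d" 0.025)"
# done
```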
Simplifying using Mapshaper (Best solution)
The Mapshaper library can output TopoJSON directly from the shapefile without JavaScript heap memory errors. This command creates a ~6 MB TopoJSON file that we can use. It also preserves topology very well by assuming that very close vertices and edges should be coincident.
npx -p mapshaper mapshaper-xl tl_2017_us_zcta510.shp snap -simplify 0.1% -filter-fields ZCTA5CE10 -rename-fields zip=ZCTA5CE10 -o format=topojson zcta_mapshaper.json
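To compare the file sizes produced by the different approaches, a small helper like the one below can report each output in megabytes. This is a sketch: size_mb is a hypothetical name, it uses decimal megabytes (1,000,000 bytes), and it only reports files that exist in the current directory.

```shell
# Print the size of a file in decimal megabytes (hypothetical helper).
size_mb() {
  # Reading from stdin makes wc print only the byte count, without the file name
  bytes=$(wc -c < "$1")
  awk -v b="$bytes" 'BEGIN { printf "%.1f\n", b / 1000000 }'
}

# Report whichever of the output files from the steps above are present
for f in zcta.geojson zcta_topo.json zcta_mapshaper.json; do
  if [ -f "$f" ]; then echo "$f: $(size_mb "$f") MB"; fi
done
```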
Source: https://github.com/elastic/ems-file-service/issues/6