easyidp.shp.read_shp

easyidp.shp.read_shp(shp_path, shp_proj=None, name_field=-1, include_title=False, encoding='utf-8', return_proj=False)

Read a shp file into a dictionary of numpy polygon coordinates.

Parameters:
  • shp_path (str) – the file path of *.shp

  • shp_proj (str | pyproj object) – by default None, in which case the projection is read automatically from the prj file that has the same name as the shp file. It can also be given manually, either as a pyproj object, read_shp(..., shp_proj=pyproj.CRS.from_epsg(4326), ...), or as the path to a prj file, read_shp(..., shp_proj=r'path/to/{shp_name}.prj', ...)

  • name_field (str or int or list[ str|int ], optional) – by default -1, the index or name of the shp file field(s) whose values are used as the output dictionary keys

  • include_title (bool, optional) – by default False, whether to add the column name to each roi key.

  • encoding (str) – by default 'utf-8'; for files containing Chinese characters, 'gbk' may be required

  • return_proj (bool, optional) – by default False; if True, the pyproj.CRS object of the current shp file is returned as well.
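The default behaviour of shp_proj described above, looking for a prj file that shares the shp file's name, can be pictured with the sketch below. The helper default_prj_path is hypothetical and is not part of easyidp; it only illustrates the naming convention, and the path used is a made-up example.

```python
from pathlib import Path


def default_prj_path(shp_path):
    """Return the sibling .prj path for a .shp path.

    Illustrative helper only (an assumption, not easyidp API).
    """
    return Path(shp_path).with_suffix(".prj")


# Made-up path, used only to show the naming convention
prj = default_prj_path("fields/plots.shp")
print(prj.name)  # plots.prj
```

If no such prj file exists next to the shp file, passing shp_proj explicitly is the documented alternative.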

Returns:

  • dict – a dictionary that maps each key to a numpy array of polygon coordinates

    {'id1': np.array([[x1,y1],[x2,y2],...]),
     'id2': np.array([[x1,y1],[x2,y2],...]),...}
    
  • pyproj.CRS, optional – returned only when return_proj=True

Example

The example shp file has the following columns:

| [0] ID     | [1] MASSIFID | [2] CROPTYPE | [3] CROPDATE | [4] CROPAREA | [5] ATTID |
|------------|--------------|--------------|--------------|--------------|-----------|
| 23010…0000 | 23010…0000   | 小麦         | 2018-09-01   | 61525.26302  |           |
| 23010…0012 | 23010…0012   | 蔬菜         | 2018-09-01   | 2802.33512   |           |
| 23010…0014 | 23010…0014   | 玉米         | 2018-09-01   | 6960.7745    |           |
| 23010…0061 | 23010…0061   | 牧草         | 2018-09-01   | 25349.08639  |           |
| 23010…0062 | 23010…0062   | 玉米         | 2018-09-01   | 71463.27666  |           |
| 23010…0582 | 23010…0582   | 胡萝卜       | 2018-09-01   | 288.23876    |           |
| 23010…0577 | 23010…0577   | 杂豆         | 2018-09-01   | 2001.80384   |           |
| 23010…0583 | 23010…0583   | 大豆         | 2018-09-01   | 380.41704    |           |
| 23010…0584 | 23010…0584   | 其它         | 2018-09-01   | 9133.25998   |           |
| 23010…0585 | 23010…0585   | 其它         | 2018-09-01   | 1704.27193   |           |

First, prepare the data:

>>> import easyidp as idp
>>> testdata = idp.data.TestData()
>>> data_path = testdata.shp.complex_shp

Then use the second column, MASSIFID, as the shape keys:

>>> out = idp.shp.read_shp(data_path, name_field="MASSIFID", encoding='gbk')
>>> # or
>>> out = idp.shp.read_shp(data_path, name_field=1, encoding='gbk')
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
[shp] read shp [complex_shp_review.shp]: 100%|███████████| 323/323 [00:02<00:00, 143.13it/s]
>>> out['23010...0000']
array([[ 45.83319255, 126.84383445],
       [ 45.83222256, 126.84212197],
       ...
       [ 45.83321205, 126.84381378],
       [ 45.83319255, 126.84383445]])

Because CROPTYPE contains duplicated values, it cannot be used as a unique key on its own, but you can combine several columns by passing a list to name_field:

>>> out = idp.shp.read_shp(data_path, name_field=["CROPTYPE", "MASSIFID"], encoding='gbk')
>>> # or
>>> out = idp.shp.read_shp(data_path, name_field=[2, 1], encoding='gbk')
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
[shp] read shp [complex_shp_review.shp]: 100%|███████████| 323/323 [00:02<00:00, 143.13it/s]
>>> out.keys()
dict_keys(['小麦_23010...0000', '蔬菜_23010...0012', '玉米_23010...0014', ... ])
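Why a single column may fail as a key while a combination works can be checked with plain Python before calling read_shp. The sketch below is only an illustration: the rows are copied from the example table, with the IDs abbreviated as they are on this page, and it does not use or reproduce easyidp's own code.

```python
from collections import Counter

# (CROPTYPE, MASSIFID) pairs copied from the example table;
# the IDs are abbreviated exactly as shown on this page
rows = [
    ("小麦", "23010...0000"),
    ("蔬菜", "23010...0012"),
    ("玉米", "23010...0014"),
    ("牧草", "23010...0061"),
    ("玉米", "23010...0062"),
    ("其它", "23010...0584"),
    ("其它", "23010...0585"),
]

# Count how often each crop type appears; any count above 1 means
# CROPTYPE alone would produce clashing dictionary keys
crop_counts = Counter(crop for crop, _ in rows)
duplicated = {crop for crop, n in crop_counts.items() if n > 1}

# Joining both columns gives one distinct key per row
combined = [f"{crop}_{massif}" for crop, massif in rows]
is_unique = len(set(combined)) == len(combined)

print(duplicated)
print(is_unique)
```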

You can also add the column names to each key by setting include_title=True:

>>> out = idp.shp.read_shp(data_path, name_field=["CROPTYPE", "MASSIFID"], include_title=True, encoding='gbk')
>>> out.keys()
dict_keys(['CROPTYPE_小麦_MASSIFID_23010...0000', 'CROPTYPE_蔬菜_MASSIFID_23010...0012', ... ])
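The key formats shown in the outputs above can be mimicked with a few lines of Python. The function build_key, its sep argument, and the record dictionary are hypothetical names introduced for illustration; this is a sketch of the observed naming pattern, not easyidp's implementation.

```python
def build_key(record, fields, include_title=False, sep="_"):
    """Join the selected field values into one key.

    Hypothetical helper that reproduces the key pattern seen in the
    example output; it is not taken from easyidp.
    """
    parts = []
    for field in fields:
        if include_title:
            # Prefix each value with its column name
            parts.append(field)
        parts.append(str(record[field]))
    return sep.join(parts)


# One row from the example table (ID abbreviated as on this page)
record = {"CROPTYPE": "小麦", "MASSIFID": "23010...0000"}

plain = build_key(record, ["CROPTYPE", "MASSIFID"])
titled = build_key(record, ["CROPTYPE", "MASSIFID"], include_title=True)

print(plain)   # 小麦_23010...0000
print(titled)  # CROPTYPE_小麦_MASSIFID_23010...0000
```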