easyidp.shp.read_shp

easyidp.shp.read_shp(shp_path, shp_proj=None, name_field=-1, include_title=False, encoding='utf-8', return_proj=False)

Read a shapefile (*.shp) into a dictionary of NumPy coordinate arrays.

Parameters:

shp_path (str) – the path to the *.shp file
shp_proj (str | pyproj object) – by default None, in which case the projection is read automatically from the .prj file that shares the shapefile's name; it can also be given manually, e.g. read_shp(..., shp_proj=pyproj.CRS.from_epsg(4326), ...) or read_shp(..., shp_proj=r'path/to/{shp_name}.prj', ...)
name_field (str or int or list[ str|int ], optional) – by default -1, the index or name of the shapefile field(s) whose values are used as the output dictionary keys
include_title (bool, optional) – by default False, whether to add the column name to each ROI key
encoding (str) – by default 'utf-8'; shapefiles containing Chinese characters may require 'gbk'
return_proj (bool, optional) – by default False; if True, also return the pyproj.CRS object of the shapefile

Returns:

dict – a dictionary mapping each key to a NumPy array of polygon coordinates

{'id1': np.array([[x1,y1],[x2,y2],...]),
 'id2': np.array([[x1,y1],[x2,y2],...]),...}

pyproj.CRS, optional – returned only when return_proj=True

Example

The example shp file has the following columns:

[0] ID	[1] MASSIFID	[2] CROPTYPE	[3] CROPDATE	[4] CROPAREA	[5] ATTID
23010…0000	23010…0000	小麦	2018-09-01	61525.26302
23010…0012	23010…0012	蔬菜	2018-09-01	2802.33512
23010…0014	23010…0014	玉米	2018-09-01	6960.7745
23010…0061	23010…0061	牧草	2018-09-01	25349.08639
23010…0062	23010…0062	玉米	2018-09-01	71463.27666
…	…	…	…	…	…
23010…0582	23010…0582	胡萝卜	2018-09-01	288.23876
23010…0577	23010…0577	杂豆	2018-09-01	2001.80384
23010…0583	23010…0583	大豆	2018-09-01	380.41704
23010…0584	23010…0584	其它	2018-09-01	9133.25998
23010…0585	23010…0585	其它	2018-09-01	1704.27193

First, prepare the data:

>>> import easyidp as idp
>>> testdata = idp.data.TestData()
>>> data_path = testdata.shp.complex_shp

Then use the second column, MASSIFID, as the dictionary keys:

>>> out = idp.shp.read_shp(data_path, name_field="MASSIFID", encoding='gbk')
>>> # or
>>> out = idp.shp.read_shp(data_path, name_field=1, encoding='gbk')
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
[shp] read shp [complex_shp_review.shp]: 100%|███████████| 323/323 [00:02<00:00, 143.13it/s]
>>> out['23010...0000']
array([[ 45.83319255, 126.84383445],
       [ 45.83222256, 126.84212197],
       ...
       [ 45.83321205, 126.84381378],
       [ 45.83319255, 126.84383445]])

Because CROPTYPE contains duplicate values, it cannot be used as a unique key on its own, but you can combine several columns by passing a list to name_field:

>>> out = idp.shp.read_shp(data_path, name_field=["CROPTYPE", "MASSIFID"], encoding='gbk')
>>> # or
>>> out = idp.shp.read_shp(data_path, name_field=[2, 1], encoding='gbk')
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
[shp] read shp [complex_shp_review.shp]: 100%|███████████| 323/323 [00:02<00:00, 143.13it/s]
>>> out.keys()
dict_keys(['小麦_23010...0000', '蔬菜_23010...0012', '玉米_23010...0014', ... ])
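The reason for combining columns can be checked without loading a shapefile. The following is a minimal sketch in plain Python, not part of easyidp: the helper `duplicated_values` and the `rows` list are hypothetical, and the identifiers are the shortened placeholders from the table above rather than real MASSIFID values.

```python
from collections import Counter

# Hypothetical (CROPTYPE, MASSIFID) pairs mirroring three rows of the table;
# the MASSIFID strings are the elided placeholders, not real identifiers.
rows = [
    ("玉米", "23010...0014"),
    ("牧草", "23010...0061"),
    ("玉米", "23010...0062"),
]


def duplicated_values(values):
    """Return the values that occur more than once, in first-seen order."""
    counts = Counter(values)
    return [value for value in counts if counts[value] > 1]


# Keys built from CROPTYPE alone collide ...
crop_only = [crop for crop, _ in rows]
# ... while keys built from CROPTYPE and MASSIFID together do not.
combined = [f"{crop}_{massif}" for crop, massif in rows]

print(duplicated_values(crop_only))  # ['玉米']
print(duplicated_values(combined))   # []
```

A check like this on the attribute column you plan to use is a cheap way to decide whether a single field is enough or a list of fields is needed.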

You can also prefix each value in the key with its column name by setting include_title=True:

>>> out = idp.shp.read_shp(data_path, name_field=["CROPTYPE", "MASSIFID"], include_title=True, encoding='gbk')
>>> out.keys()
dict_keys(['CROPTYPE_小麦_MASSIFID_23010...0000', 'CROPTYPE_蔬菜_MASSIFID_23010...0012', ... ])
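The key formats shown in the examples above can be summarised with a short sketch. This is an illustration only, assuming keys are joined with underscores exactly as in the printed output; `build_key`, `columns`, and `record` are hypothetical names and do not reflect easyidp's internal implementation.

```python
def build_key(record, columns, name_field, include_title=False):
    """Compose a dictionary key from one attribute record.

    Illustrative only: reproduces the key format shown in the examples,
    not easyidp's internal code.
    """
    # Treat a single field and a list of fields uniformly.
    fields = name_field if isinstance(name_field, list) else [name_field]
    parts = []
    for field in fields:
        # Accept either a column index or a column name.
        idx = field if isinstance(field, int) else columns.index(field)
        if include_title:
            parts.append(columns[idx])
        parts.append(str(record[idx]))
    return "_".join(parts)


# Column names from the example table; the ID values are elided placeholders.
columns = ["ID", "MASSIFID", "CROPTYPE", "CROPDATE", "CROPAREA", "ATTID"]
record = ["23010...0000", "23010...0000", "小麦", "2018-09-01", 61525.26302, ""]

print(build_key(record, columns, "MASSIFID"))
# 23010...0000
print(build_key(record, columns, ["CROPTYPE", "MASSIFID"]))
# 小麦_23010...0000
print(build_key(record, columns, [2, 1], include_title=True))
# CROPTYPE_小麦_MASSIFID_23010...0000
```

In this sketch a column index and a column name are interchangeable, which matches how the examples treat name_field=1 and name_field="MASSIFID" as equivalent.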