从 shapefile 加载 ROI¶
本示例展示了如何将 shapefile ( *.shp ) 作为 easyidp.ROI 对象打开。
![]()
包和数据准备¶
使用以下代码加载 easyidp 包和示例数据集
[1]:
import easyidp as idp
test_data = idp.data.TestData()
如果您是第一次运行,它将自动从 Google Drive 下载大约 400MB 的数据集,请参考 Data 了解更多详情。
这里是本文件中使用的三个示例 shapefiles:
test_data.shp.lotus_shptest_data.shp.utm53n_shptest_data.shp.complex_shp
每个变量提供 *.shp 文件的路径:
[2]:
test_data.shp.lotus_shp
[2]:
PosixPath('/Users/hwang/Library/Application Support/easyidp.data/data_for_tests/shp_test/lotus_plots.shp')
处理 ESPG:4326(经度、纬度)¶
lotus_shp 使用 EPGS:4326 作为地理投影坐标
[3]:
lotus_roi = idp.ROI(test_data.shp.lotus_shp, name_field=0)
[shp][proj] Use projection [WGS 84] for loaded shapefile [lotus_plots.shp]
Read shapefile [lotus_plots.shp]: 100%|██████████| 112/112 [00:00<00:00, 1740.90it/s]
使用此方法检查该 shapefile 的 CRS(地理投影坐标):
[4]:
lotus_roi.crs
[4]:
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
我们可以通过以下方式检查地块多边形边界值:
[5]:
lotus_roi
[5]:
<easyidp.ROI> with 112 items
[0] N1W1
array([[139.54052962, 35.73475194],
[139.54055106, 35.73475596],
[139.54055592, 35.73473843],
[139.54053438, 35.73473446],
[139.54052962, 35.73475194]])
[1] N1W2
array([[139.54053488, 35.73473289],
[139.54055632, 35.73473691],
[139.54056118, 35.73471937],
[139.54053963, 35.73471541],
[139.54053488, 35.73473289]])
...
[110] S4E6
array([[139.54090456, 35.73453742],
[139.540926 , 35.73454144],
[139.54093086, 35.7345239 ],
[139.54090932, 35.73451994],
[139.54090456, 35.73453742]])
[111] S4E7
array([[139.54090986, 35.73451856],
[139.54093129, 35.73452258],
[139.54093616, 35.73450504],
[139.54091461, 35.73450107],
[139.54090986, 35.73451856]])
注意
在 easyidp 包中,ROI 的顺序是 (经度,纬度), 而对于一些其他包如 pyproj, shapely, 可能使用 (纬度,经度) 顺序,在包之间转换时请注意这一点
处理 UTM/Zone 地理坐标¶
utm53_shp 使用另一种地理坐标,而不是 (经度, 纬度)
[6]:
utm_roi = idp.ROI(test_data.shp.utm53n_shp, name_field=0)
[shp][proj] Use projection [WGS 84 / UTM zone 53N] for loaded shapefile [lon_lat_utm53n.shp]
Read shapefile [lon_lat_utm53n.shp]: 100%|██████████| 120/120 [00:00<00:00, 2506.63it/s]
然后我们可以检查此文件的 CRS(地理坐标):
[7]:
utm_roi.crs
[7]:
<Derived Projected CRS: EPSG:32653>
Name: WGS 84 / UTM zone 53N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- undefined
Coordinate Operation:
- name: UTM zone 53N
- method: Transverse Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
以及地块多边形坐标值:
[8]:
utm_roi
[8]:
<easyidp.ROI> with 120 items
[0] 1_02
array([[ 484581.96481965, 3862282.54659697],
[ 484581.71899749, 3862283.05047413],
[ 484581.92015662, 3862283.14994052],
[ 484582.17518015, 3862282.68486107],
[ 484581.96481965, 3862282.54659697]])
[1] 1_03
array([[ 484583.12131417, 3862280.30678878],
[ 484582.87549189, 3862280.81066591],
[ 484583.07665105, 3862280.91013233],
[ 484583.3316747 , 3862280.44505291],
[ 484583.12131417, 3862280.30678878]])
...
[118] 6_35
array([[ 484589.18991156, 3862259.98283581],
[ 484588.50691422, 3862261.2903283 ],
[ 484588.97321497, 3862261.50689677],
[ 484589.6197077 , 3862260.22275374],
[ 484589.18991156, 3862259.98283581]])
[119] 6_36
array([[ 484590.32812571, 3862257.73529679],
[ 484589.64512804, 3862259.04278919],
[ 484590.11143074, 3862259.26046666],
[ 484590.75792192, 3862257.97521477],
[ 484590.32812571, 3862257.73529679]])
在这种坐标下,单位是米,X(第一列)是东西(水平)方向,而 Y(第二列)是南北(垂直)方向。轴顺序与 (经度,纬度) 相同
在 CRS 之间转换¶
在某些情况下,例如,不是同一个人准备 DOM 和 shapefile,它们不共享相同的坐标并且多边形边界值非常不同。因此它们不能直接放在一起,需要转换。
注意
虽然 EasyIDP 支持不同 CRS 之间的转换,但如果 roi 数量庞大,可能会有精度损失并需要一些计算时间。
建议在准备 shp 和 DOM/DSM/PCD 时确保它们共享相同的 CRS。
例如,在莲花案例中,它使用 EPSG:4326,而 DOM/DSM 使用 UTM/Zone。
[9]:
lotus_roi.crs
[9]:
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
[10]:
dsm = idp.GeoTiff(test_data.metashape.lotus_dsm)
dsm.crs
[10]:
<Derived Projected CRS: EPSG:32654>
Name: WGS 84 / UTM zone 54N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 138°E and 144°E, northern hemisphere between equator and 84°N, onshore and offshore. Japan. Russian Federation.
- bounds: (138.0, 0.0, 144.0, 84.0)
Coordinate Operation:
- name: UTM zone 54N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
因此需要进行 CRS 转换以匹配它们。更推荐转移 ROI,因为它只是坐标数字,比 GeoTiff 的像素矩阵更容易转换。
在转换之前,地块坐标看起来像这样:
[11]:
lotus_roi
[11]:
<easyidp.ROI> with 112 items
[0] N1W1
array([[139.54052962, 35.73475194],
[139.54055106, 35.73475596],
[139.54055592, 35.73473843],
[139.54053438, 35.73473446],
[139.54052962, 35.73475194]])
[1] N1W2
array([[139.54053488, 35.73473289],
[139.54055632, 35.73473691],
[139.54056118, 35.73471937],
[139.54053963, 35.73471541],
[139.54053488, 35.73473289]])
...
[110] S4E6
array([[139.54090456, 35.73453742],
[139.540926 , 35.73454144],
[139.54093086, 35.7345239 ],
[139.54090932, 35.73451994],
[139.54090456, 35.73453742]])
[111] S4E7
array([[139.54090986, 35.73451856],
[139.54093129, 35.73452258],
[139.54093616, 35.73450504],
[139.54091461, 35.73450107],
[139.54090986, 35.73451856]])
并应用转换:
[12]:
lotus_roi.change_crs(dsm.crs)
现在 ROI 的 CRS 已经改变:
[13]:
lotus_roi.crs
[13]:
<Derived Projected CRS: EPSG:32654>
Name: WGS 84 / UTM zone 54N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 138°E and 144°E, northern hemisphere between equator and 84°N, onshore and offshore. Japan. Russian Federation.
- bounds: (138.0, 0.0, 144.0, 84.0)
Coordinate Operation:
- name: UTM zone 54N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
以及坐标值:
[14]:
lotus_roi
[14]:
<easyidp.ROI> with 112 items
[0] N1W1
array([[ 368017.7565143 , 3955511.08102276],
[ 368019.70190232, 3955511.49811902],
[ 368020.11263046, 3955509.54636219],
[ 368018.15769062, 3955509.13563382],
[ 368017.7565143 , 3955511.08102276]])
[1] N1W2
array([[ 368018.20042946, 3955508.96051697],
[ 368020.14581791, 3955509.37761334],
[ 368020.55654627, 3955507.42585654],
[ 368018.601606 , 3955507.01512806],
[ 368018.20042946, 3955508.96051697]])
...
[110] S4E6
array([[ 368051.31139629, 3955486.78103425],
[ 368053.25678767, 3955487.19813795],
[ 368053.66752456, 3955485.24638299],
[ 368051.71258131, 3955484.83564713],
[ 368051.31139629, 3955486.78103425]])
[111] S4E7
array([[ 368051.75902187, 3955484.68169527],
[ 368053.70441367, 3955485.09879908],
[ 368054.11515079, 3955483.14704415],
[ 368052.16020711, 3955482.73630818],
[ 368051.75902187, 3955484.68169527]])
更改 ROI 标签¶
您必须注意在打开 shapefile 时的 name_field 值
lotus_roi = idp.ROI(test_data.shp.lotus_shp, name_field=0)
在本节中,将介绍有关此参数和其他控制的更多详细信息。
对于某些 shapefile,它不是用 utf-8 编码的,默认加载可能会失败:
>>> idp.shp.show_shp_fields(test_data.shp.complex_shp)
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
/Users/hwang/OneDrive/Program/GitHub/EasyIDP/docs/jupyter/load_roi.ipynb Cell 35 in <cell line: 1>()
----> 1 idp.shp.show_shp_fields(test_data.shp.complex_shp)
File ~/OneDrive/Program/GitHub/EasyIDP/easyidp/shp.py:114, in show_shp_fields(shp_path, encoding)
111 head = ["[-1]"] + [f"[{v}] {k}" for k, v in shp_fields.items()]
112 data = []
--> 114 row_num = len(shp.records())
115 col_num = len(shp.records()[0])
117 col_align = ["right"] + ["center"] * col_num
File ~/anaconda/miniconda3/envs/easyidp/lib/python3.8/site-packages/shapefile.py:1306, in Reader.records(self)
1304 f.seek(self.__dbfHdrLength)
1305 for i in range(self.numRecords):
-> 1306 r = self.__record(oid=i)
1307 if r:
1308 records.append(r)
File ~/anaconda/miniconda3/envs/easyidp/lib/python3.8/site-packages/shapefile.py:1281, in Reader.__record(self, oid)
1278 value = None # unknown value is set to missing
1279 else:
1280 # anything else is forced to string/unicode
-> 1281 value = u(value, self.encoding, self.encodingErrors)
1282 value = value.strip()
...
105 elif isinstance(v, str):
106 # Already str.
107 return v
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc2 in position 2: invalid continuation byte
[15]:
idp.shp.show_shp_fields(test_data.shp.complex_shp, encoding='gbk')
[-1] [0] ID [1] MASSIFID [2] CROPTYPE [3] CROPDATE [4] CROPAREA [5] ATTID
------ --------------------------- ------------------- -------------- -------------- -------------- -----------
0 230104112201809010000000000 2301041120000000000 小麦 2018-09-01 61525.26302
1 230104112201809010000000012 2301041120000000012 蔬菜 2018-09-01 2802.33512
2 230104112201809010000000014 2301041120000000014 玉米 2018-09-01 6960.7745
... ... ... ... ... ... ...
320 230104112201809010000000583 2301041120000000583 大豆 2018-09-01 380.41704
321 230104112201809010000000584 2301041120000000584 其它 2018-09-01 9133.25998
322 230104112201809010000000585 2301041120000000585 其它 2018-09-01 1704.27193
默认情况下,如果未给出 name_field,将使用第一个编号 id 作为标签
[16]:
complex_roi = idp.ROI(test_data.shp.complex_shp, encoding='gbk')
/Users/hwang/OneDrive/Program/GitHub/EasyIDP/easyidp/shp.py:399: UserWarning: Not specifying parameter 'name_field', will using the row id (from 0 to end) as the index for each polygon.Please using idp.shp.show_shp_field(shp_path) to display the full available indexs
warnings.warn(
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
Read shapefile [complex_shp_review.shp]: 100%|██████████| 323/323 [00:00<00:00, 64383.62it/s]
[17]:
complex_roi
[17]:
<easyidp.ROI> with 323 items
[0] 0
array([[126.84383445, 45.83319255],
[126.84212197, 45.83222256],
[126.84142718, 45.83186291],
...,
[126.84373784, 45.83328959],
[126.84381378, 45.83321205],
[126.84383445, 45.83319255]])
[1] 1
array([[126.85042747, 45.84588275],
[126.85042684, 45.84570483],
[126.8504088 , 45.84570504],
...,
[126.85006453, 45.84588545],
[126.85033812, 45.84588368],
[126.85042747, 45.84588275]])
...
[321] 321
array([[126.83385574, 45.84337042],
[126.83385363, 45.84331501],
[126.83384205, 45.8432624 ],
...,
[126.83353615, 45.84359357],
[126.83370583, 45.84350153],
[126.83385574, 45.84337042]])
[322] 322
array([[126.83056335, 45.84335627],
[126.83063002, 45.84332396],
[126.83067953, 45.84328632],
...,
[126.83025148, 45.84341107],
[126.83039799, 45.84340243],
[126.83056335, 45.84335627]])
We can specify the ‘MASSIFID’ as the label instead
[18]:
complex_roi = idp.ROI(test_data.shp.complex_shp, name_field='MASSIFID', encoding='gbk')
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
Read shapefile [complex_shp_review.shp]: 100%|██████████| 323/323 [00:00<00:00, 333.24it/s]
或者通过列 id 达到相同效果:
complex_roi = idp.ROI(test_data.shp.complex_shp, name_field=1, encoding='gbk')
现在标签已经改变:
[19]:
complex_roi
[19]:
<easyidp.ROI> with 323 items
[0] 2301041120000000000
array([[126.84383445, 45.83319255],
[126.84212197, 45.83222256],
[126.84142718, 45.83186291],
...,
[126.84373784, 45.83328959],
[126.84381378, 45.83321205],
[126.84383445, 45.83319255]])
[1] 2301041120000000012
array([[126.85042747, 45.84588275],
[126.85042684, 45.84570483],
[126.8504088 , 45.84570504],
...,
[126.85006453, 45.84588545],
[126.85033812, 45.84588368],
[126.85042747, 45.84588275]])
...
[321] 2301041120000000584
array([[126.83385574, 45.84337042],
[126.83385363, 45.84331501],
[126.83384205, 45.8432624 ],
...,
[126.83353615, 45.84359357],
[126.83370583, 45.84350153],
[126.83385574, 45.84337042]])
[322] 2301041120000000585
array([[126.83056335, 45.84335627],
[126.83063002, 45.84332396],
[126.83067953, 45.84328632],
...,
[126.83025148, 45.84341107],
[126.83039799, 45.84340243],
[126.83056335, 45.84335627]])
或者您可以通过提供列表 name_field=['CROPTYPE', 'MASSIFID'] 将几列组合在一起
[20]:
complex_roi = idp.ROI(test_data.shp.complex_shp, name_field=['CROPTYPE', 'MASSIFID'], encoding='gbk')
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
Read shapefile [complex_shp_review.shp]: 100%|██████████| 323/323 [00:01<00:00, 163.54it/s]
或者通过列 id 列表 name_field=[2, 1] 达到相同效果
complex_roi = idp.ROI(test_data.shp.complex_shp, name_field=[2, 1], encoding='gbk')
现在标签已经改变:
[21]:
complex_roi
[21]:
<easyidp.ROI> with 323 items
[0] 小麦_2301041120000000000
array([[126.84383445, 45.83319255],
[126.84212197, 45.83222256],
[126.84142718, 45.83186291],
...,
[126.84373784, 45.83328959],
[126.84381378, 45.83321205],
[126.84383445, 45.83319255]])
[1] 蔬菜_2301041120000000012
array([[126.85042747, 45.84588275],
[126.85042684, 45.84570483],
[126.8504088 , 45.84570504],
...,
[126.85006453, 45.84588545],
[126.85033812, 45.84588368],
[126.85042747, 45.84588275]])
...
[321] 其它_2301041120000000584
array([[126.83385574, 45.84337042],
[126.83385363, 45.84331501],
[126.83384205, 45.8432624 ],
...,
[126.83353615, 45.84359357],
[126.83370583, 45.84350153],
[126.83385574, 45.84337042]])
[322] 其它_2301041120000000585
array([[126.83056335, 45.84335627],
[126.83063002, 45.84332396],
[126.83067953, 45.84328632],
...,
[126.83025148, 45.84341107],
[126.83039799, 45.84340243],
[126.83056335, 45.84335627]])
或者通过提供 include_title=True 将列标题也添加到其中
[22]:
complex_roi = idp.ROI(test_data.shp.complex_shp, name_field=['CROPTYPE', 'MASSIFID'], include_title=True, encoding='gbk')
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
Read shapefile [complex_shp_review.shp]: 100%|██████████| 323/323 [00:01<00:00, 171.82it/s]
[23]:
complex_roi
[23]:
<easyidp.ROI> with 323 items
[0] CROPTYPE_小麦_MASSIFID_2301041120000000000
array([[126.84383445, 45.83319255],
[126.84212197, 45.83222256],
[126.84142718, 45.83186291],
...,
[126.84373784, 45.83328959],
[126.84381378, 45.83321205],
[126.84383445, 45.83319255]])
[1] CROPTYPE_蔬菜_MASSIFID_2301041120000000012
array([[126.85042747, 45.84588275],
[126.85042684, 45.84570483],
[126.8504088 , 45.84570504],
...,
[126.85006453, 45.84588545],
[126.85033812, 45.84588368],
[126.85042747, 45.84588275]])
...
[321] CROPTYPE_其它_MASSIFID_2301041120000000584
array([[126.83385574, 45.84337042],
[126.83385363, 45.84331501],
[126.83384205, 45.8432624 ],
...,
[126.83353615, 45.84359357],
[126.83370583, 45.84350153],
[126.83385574, 45.84337042]])
[322] CROPTYPE_其它_MASSIFID_2301041120000000585
array([[126.83056335, 45.84335627],
[126.83063002, 45.84332396],
[126.83067953, 45.84328632],
...,
[126.83025148, 45.84341107],
[126.83039799, 45.84340243],
[126.83056335, 45.84335627]])
注意
只有具有唯一值的列应作为 ROI 标签使用,否则存在后续重复标签覆盖之前标签的风险。easyidp 可以处理这种情况并抛出错误。
>>> complex_roi = idp.ROI(test_data.shp.complex_shp, name_field='CROPTYPE',encoding='gbk')
[shp][proj] Use projection [WGS 84] for loaded shapefile [complex_shp_review.shp]
Read shapefile [complex_shp_review.shp]: 1%| | 4/323 [00:00<00:01, 221.13it/s]
Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/Users/hwang/OneDrive/Program/GitHub/EasyIDP/docs/jupyter/load_roi.ipynb Cell 51 in <cell line: 1>()
----> 1 complex_roi = idp.ROI(test_data.shp.complex_shp, name_field='CROPTYPE',encoding='gbk')
File ~/OneDrive/Program/GitHub/EasyIDP/easyidp/roi.py:77, in ROI.__init__(self, target_path, **kwargs)
74 self.source = target_path
76 if target_path is not None:
---> 77 self.open(target_path, **kwargs)
File ~/OneDrive/Program/GitHub/EasyIDP/easyidp/roi.py:136, in ROI.open(self, target_path, **kwargs)
134 ext = os.path.splitext(target_path)[-1]
135 if ext == ".shp":
--> 136 self.read_shp(target_path, **kwargs)
137 elif ext == ".json":
138 self.read_labelme_json(target_path)
File ~/OneDrive/Program/GitHub/EasyIDP/easyidp/roi.py:215, in ROI.read_shp(self, shp_path, shp_proj, name_field, include_title, encoding)
142 """read ROI from shp file
143
144 Parameters
(...)
211
212 """
213 # if geotiff_proj is not None and shp_proj is not None and shp_proj.name != geotiff_proj.name:
...
--> 323 raise KeyError(f"Meet with duplicated key [{plot_name}] for current shapefile, please specify another `name_field` from {shp_fields} or simple leave it blank `name_field=None`")
325 shp_dict[plot_name] = coord_np
327 if return_proj:
KeyError: "Meet with duplicated key [玉米] for current shapefile, please specify another `name_field` from {'ID': 0, 'MASSIFID': 1, 'CROPTYPE': 2, 'CROPDATE': 3, 'CROPAREA': 4, 'ATTID': 5} or simple leave it blank `name_field=None`"