
Converting annotation files from XML to TXT format

전자둥이 2022. 7. 26. 16:14

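Hello.

I wanted to train a YOLOv5s model on custom data, but found that it expects dataset annotations in a TXT file format, so I am going to convert the annotation format of the dataset I already have.

Modifying the data loader would also work, but I wanted to leave the training code untouched, so I chose this approach.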

 

* Training code used

https://github.com/ultralytics/yolov5

 


 

- yolov5s annotation file format

annotation format

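Each box must be written on its own line in the order class x_center y_center width height, with the coordinates normalized by the image width and height.

Example

My custom dataset was annotated in the PASCAL VOC format, so every annotation had to be converted into the form YOLOv5 expects.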
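To make the conversion concrete, here is a minimal sketch of the arithmetic for a single box. The helper name voc_box_to_yolo is hypothetical and only for illustration; the numbers are the first box of the sample file in this post (a 1000 x 667 image), and the rounding to 6 decimal places follows the full script further down.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_width, img_height):
    """Convert a pixel-space corner box to normalized
    (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2 / img_width
    y_center = (ymin + ymax) / 2 / img_height
    box_width = (xmax - xmin) / img_width
    box_height = (ymax - ymin) / img_height
    # Round to 6 decimal places, as the conversion script does.
    return tuple(round(v, 6) for v in (x_center, y_center, box_width, box_height))


# First "person" box of the sample annotation (class index 14).
yolo_box = voc_box_to_yolo(56, 211, 174, 368, img_width=1000, img_height=667)
print(14, *yolo_box)  # 14 0.115 0.434033 0.118 0.235382
```

The x values are divided by the image width and the y values by the image height, so every number after the class index ends up between 0 and 1.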

 

- Original file

        * (Note) This is one of the annotation files I already had.

<annotation>
	<folder>your_folder_name</folder>
	<filename>273271,1a27b000f0c7a077.jpg</filename>
	<source>
		<database>your_database</database>
	</source>
	<size>
		<width>1000</width>
		<height>667</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>56</xmin>
			<ymin>211</ymin>
			<xmax>174</xmax>
			<ymax>368</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>164</xmin>
			<ymin>215</ymin>
			<xmax>250</xmax>
			<ymax>355</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>231</xmin>
			<ymin>232</ymin>
			<xmax>318</xmax>
			<ymax>376</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>308</xmin>
			<ymin>221</ymin>
			<xmax>405</xmax>
			<ymax>491</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>358</xmin>
			<ymin>208</ymin>
			<xmax>435</xmax>
			<ymax>472</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>438</xmin>
			<ymin>206</ymin>
			<xmax>518</xmax>
			<ymax>385</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>526</xmin>
			<ymin>229</ymin>
			<xmax>603</xmax>
			<ymax>340</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>605</xmin>
			<ymin>220</ymin>
			<xmax>685</xmax>
			<ymax>454</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>678</xmin>
			<ymin>215</ymin>
			<xmax>767</xmax>
			<ymax>489</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>764</xmin>
			<ymin>224</ymin>
			<xmax>833</xmax>
			<ymax>483</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>787</xmin>
			<ymin>233</ymin>
			<xmax>885</xmax>
			<ymax>482</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>868</xmin>
			<ymin>242</ymin>
			<xmax>975</xmax>
			<ymax>533</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>703</xmin>
			<ymin>246</ymin>
			<xmax>787</xmax>
			<ymax>423</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>771</xmin>
			<ymin>349</ymin>
			<xmax>900</xmax>
			<ymax>576</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>693</xmin>
			<ymin>335</ymin>
			<xmax>807</xmax>
			<ymax>563</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>245</xmin>
			<ymin>328</ymin>
			<xmax>369</xmax>
			<ymax>538</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>211</xmin>
			<ymin>300</ymin>
			<xmax>294</xmax>
			<ymax>421</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>135</xmin>
			<ymin>343</ymin>
			<xmax>260</xmax>
			<ymax>542</ymax>
		</bndbox>
	</object>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox> 
			<xmin>37</xmin>
			<ymin>313</ymin>
			<xmax>179</xmax>
			<ymax>539</ymax>
		</bndbox>
	</object>
</annotation>

 

 

- Result

      * Output after conversion. Note that the person class is index 14 (0-based, in the PASCAL VOC class order).

14 0.115 0.434033 0.118 0.235382
14 0.207 0.427286 0.086 0.209895
14 0.2745 0.455772 0.087 0.215892
14 0.3565 0.533733 0.097 0.404798
14 0.3965 0.509745 0.077 0.395802
14 0.478 0.443028 0.08 0.268366
14 0.5645 0.426537 0.077 0.166417
14 0.645 0.505247 0.08 0.350825
14 0.7225 0.527736 0.089 0.410795
14 0.7985 0.529985 0.069 0.388306
14 0.836 0.535982 0.098 0.373313
14 0.9215 0.58096 0.107 0.436282
14 0.745 0.501499 0.084 0.265367
14 0.8355 0.693403 0.129 0.34033
14 0.75 0.673163 0.114 0.341829
14 0.307 0.649175 0.124 0.314843
14 0.2525 0.54048 0.083 0.181409
14 0.1975 0.663418 0.125 0.298351
14 0.108 0.638681 0.142 0.338831
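One way to sanity-check a converted file is to map a line back to pixel coordinates and compare it with the XML. The sketch below is only an illustration: the function name yolo_line_to_voc is hypothetical, and it assumes you know the image size (1000 x 667 for this sample).

```python
def yolo_line_to_voc(line, img_width, img_height):
    """Invert one 'class x_center y_center width height' line
    back to (class, xmin, ymin, xmax, ymax) in pixels."""
    class_index, x_center, y_center, box_width, box_height = line.split()
    x_center, y_center = float(x_center), float(y_center)
    half_w, half_h = float(box_width) / 2, float(box_height) / 2
    # Rounding absorbs the small error left by the 6-decimal output.
    xmin = round((x_center - half_w) * img_width)
    ymin = round((y_center - half_h) * img_height)
    xmax = round((x_center + half_w) * img_width)
    ymax = round((y_center + half_h) * img_height)
    return int(class_index), xmin, ymin, xmax, ymax


print(yolo_line_to_voc("14 0.115 0.434033 0.118 0.235382", 1000, 667))
# (14, 56, 211, 174, 368)
```

The recovered corners match the first bndbox in the original XML file, which suggests the conversion is lossless up to rounding.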

 

*Code

"""
Read PASCAL VOC XML annotations and convert them to YOLOv5 TXT labels.

Fields read from each XML file:
width, height, class name, xmin, ymin, xmax, ymax

made by : ysjo
"""

import os
import xml.etree.ElementTree as ET


## PASCAL VOC
PASCAL_Class_index = {"aeroplane": 0,
                "bicycle": 1,
                "bird": 2,
                "boat": 3,
                "bottle": 4,
                "bus": 5,
                "car": 6,
                "cat": 7,
                "chair": 8,
                "cow": 9,
                "diningtable": 10,
                "dog": 11,
                "horse": 12,
                "motorbike": 13,
                "person": 14,
                "pottedplant": 15,
                "sheep": 16,
                "sofa": 17,
                "train": 18,
                "tvmonitor": 19}

XML_DIRECTORY = "./xml/"
TXT_DIRECTORY = "./txt/"


def Write_TXT(file_name, result):
    # Swap the .xml extension for .txt and write one box per line.
    file_name = os.path.splitext(file_name)[0] + ".txt"
    file_path = os.path.join(TXT_DIRECTORY, file_name)
    with open(file_path, "w") as f:
        for data in result:
            # e.g. "14 0.115 0.434033 0.118 0.235382"
            f.write(" ".join(str(value) for value in data) + "\n")


def Read_XML(file_path, file_name):
    tree = ET.parse(file_path)
    root = tree.getroot()
    ## image size
    size = root.find("size")
    width = float(size.find("width").text)
    height = float(size.find("height").text)

    ## box info
    result = []
    # "obj" rather than "object", to avoid shadowing the built-in
    for obj in root.findall("object"):
        name = obj.find("name").text
        class_index = PASCAL_Class_index[name]
        bndbox = obj.find("bndbox")
        xmin = float(bndbox.find("xmin").text)
        ymin = float(bndbox.find("ymin").text)
        xmax = float(bndbox.find("xmax").text)
        ymax = float(bndbox.find("ymax").text)
        # Normalize by the image size and round to 6 decimal places.
        bnd_width = round((xmax - xmin) / width, 6)
        bnd_height = round((ymax - ymin) / height, 6)
        x_center = round((xmax + xmin) / 2 / width, 6)
        y_center = round((ymax + ymin) / 2 / height, 6)
        result.append([class_index, x_center, y_center, bnd_width, bnd_height])
    Write_TXT(file_name=file_name, result=result)


def createFolder(directory):
    try:
        os.makedirs(directory, exist_ok=True)
    except OSError:
        print("Error: Creating directory. " + directory)


def main():
    if not os.path.isdir(XML_DIRECTORY):
        raise FileNotFoundError(f"XML directory not found: {XML_DIRECTORY}")
    createFolder(TXT_DIRECTORY)
    # Walk XML_DIRECTORY recursively; all TXT files go into TXT_DIRECTORY.
    for root, directories, files in os.walk(XML_DIRECTORY):
        for file in files:
            if file.endswith(".xml"):
                file_path = os.path.join(root, file)
                Read_XML(file_path, file)


if __name__ == "__main__":
    main()

Adding argparse or tqdm would make the script a bit more convenient to use.
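As a rough sketch of the argparse idea, the two directory constants could become command-line options. The flag names --xml-dir and --txt-dir and the parse_args helper are my own assumptions, not part of the script above; the defaults mirror XML_DIRECTORY and TXT_DIRECTORY.

```python
import argparse


def parse_args(argv=None):
    """Parse the input and output directories (hypothetical flag names)."""
    parser = argparse.ArgumentParser(
        description="Convert PASCAL VOC XML annotations to YOLOv5 TXT labels."
    )
    parser.add_argument("--xml-dir", default="./xml/",
                        help="directory containing the XML annotation files")
    parser.add_argument("--txt-dir", default="./txt/",
                        help="directory where the TXT label files are written")
    return parser.parse_args(argv)


# An empty argument list falls back to the defaults.
args = parse_args([])
print(args.xml_dir, args.txt_dir)
```

In main(), args.xml_dir and args.txt_dir would then replace the two constants. If tqdm is installed, wrapping the file loop in tqdm(...) should be enough to get a progress bar.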

 

That's all!
