Python YAML Parser

In this tutorial, we will learn how to read, write, and perform other operations on YAML files using Python. We will discuss the YAML file format, its usage, and how to manipulate it from Python. Let's start with a brief introduction to YAML.

What is YAML?

YAML originally stood for "Yet Another Markup Language"; the official expansion is now the recursive acronym "YAML Ain't Markup Language". It is a human-readable data serialization format that has gained much popularity in recent years and is widely used for configuration files and for data storage or transmission. YAML supports three kinds of data - scalars (such as strings, integers, and floats), lists, and associative arrays (mappings). YAML files are saved with the .yaml or .yml extension. Comments start with the # symbol. Each item of a list is preceded by a hyphen, and values are nested using indentation.

Advantages of YAML

Some important advantages of YAML are as follows.
Before going further, we assume that you have a basic understanding of Python or beginner-level experience programming in Python.

PyYAML Module

PyYAML is a Python module that provides a range of methods for working with YAML files. With it we can easily convert a YAML file into a Python dictionary and read its content, read and write complex YAML configuration files, and serialize and persist YAML data. To use PyYAML, we first need to install it on our system. The installation steps are given below.

Installing PyYAML

We can install it using either of the methods below.
Using pip command

We can install it using the pip command. Type the following command in the terminal to install the PyYAML module.

    pip install pyyaml

Install via source code

If the pip command fails with an error, we can install PyYAML from its source code instead. Follow the instructions below.
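After installing by either route, a quick check from Python can confirm that the module is importable. The snippet below is only a minimal sketch: it prints the installed version and parses a one-line document (the key name `installed` is made up for illustration).

```python
import yaml

# yaml.__version__ holds the version string of the installed PyYAML package.
installed_version = yaml.__version__
print("PyYAML version:", installed_version)

# A tiny parse confirms that the parser itself works.
parsed = yaml.safe_load("installed: true")
print(parsed)
```

If the import fails with ModuleNotFoundError, the package was most likely installed into a different Python environment than the one running the script.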
Reading YAML File

First, we create a new YAML file named sample.yaml that we will read using the PyYAML module.

sample.py

The yaml.load() method is used to read a YAML file. It parses the YAML document and converts it into Python objects (a dictionary for a top-level mapping) so that we can read the content easily. This process is called deserialization of YAML into Python. The first argument of load() is the stream to parse, which can be a byte string, a Unicode string, an open binary file object, or an open text file object. A byte string or binary file must be encoded in utf-8, utf-16-be, or utf-16-le. Recent PyYAML versions also expect a Loader argument, which is required as of PyYAML 6.0. Let's understand the following example.

Example -

Output:

    [{'UserName': 'Antonio', 'Password': 'fire123 *', 'phone': 9879098, 'Skills': '-Python -SQL -Django -Rest Framework -JavaScript'}]

Explanation -

In the above code, we imported yaml and its Loader to read the YAML file. The load() function can be used with four types of Loader.
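As a hedged illustration of how the loader choice matters: the loader classes PyYAML provides are BaseLoader, SafeLoader, FullLoader, and UnsafeLoader. The sketch below compares the first two on an inline string, which is only a stand-in for sample.yaml (its exact contents are an assumption based on the output above).

```python
import yaml

# Inline stand-in for sample.yaml; the keys mirror the output shown above.
document = """
UserName: Antonio
phone: 9879098
"""

# BaseLoader keeps every scalar as a plain string.
as_strings = yaml.load(document, Loader=yaml.BaseLoader)

# SafeLoader resolves standard YAML types such as integers.
as_typed = yaml.load(document, Loader=yaml.SafeLoader)

# yaml.safe_load(stream) is shorthand for load(stream, Loader=SafeLoader).
shorthand = yaml.safe_load(document)

print(as_strings)
print(as_typed)
```

For untrusted input, SafeLoader (or safe_load) is generally the prudent choice, because it only builds plain Python types.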
A generator object is what yaml.load_all() returns, which is why the result was converted into a list before any element could be accessed; yaml.load() itself returns the parsed document directly. We can also get the YAML values in the form of a dictionary. Let's understand the following example.

Example - 2

Output:

    {'UserName': 'Antonio', 'Password': 'fire123 *', 'phone': 9879098, 'Skills': '-Python -SQL -Django -Rest Framework -JavaScript'}

Here we passed FullLoader instead of SafeLoader and got the YAML data back as a dictionary. Note that it is yaml.load(), rather than the choice of loader, that returns a single object, so we don't need to convert the loaded data into a list.

Read Multiple YAML Documents

We can read multiple YAML documents using the yaml.load_all() method. A single YAML file can contain multiple documents. Below is an example of multiple documents in a single file.

sample.yaml

Each document starts with three dashes (---) and ends with three dots (...). Let's understand the following example.

Example -

Output:

    [{'UserName': 'Antonio', 'Password': 'fire123 *', 'phone': 9879098, 'Skills': '-Python -SQL -Django -Rest Framework -JavaScript'}, {'UserName': 'Maino', 'Password': 'fire123 *', 'phone': 9879098, 'Skills': '-Python -SQL -Django -Rest Framework -JavaScript'}, {'UserName': 'George', 'Password': 'fire123 *', 'phone': 9879098, 'Skills': '-Python -SQL -Django -Rest Framework -JavaScript'}]

Explanation -

The load_all() method returns a generator object, which we converted into a list so that we could access any element.

In the previous examples, we learned how to read a YAML file. Now we will learn how to dump data into a YAML file.

Write YAML File Using PyYAML Module

Writing Python data into YAML is known as serialization. To dump data into a YAML file, we use the yaml.dump() method. Let's understand the following example.
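A minimal sketch of serialization with yaml.dump() is given here. The dictionary is an assumption modelled on the values that appear in this tutorial's output, and default_flow_style=False is passed explicitly so that block style is used regardless of the PyYAML version.

```python
import yaml

# Illustrative data; the values are borrowed from the tutorial's output.
user_details = {
    "User": "Zoey",
    "Password": "Xavier@123",
    "Phone": 345464,
    "Skills": ["Python", "SQL", "Django", "Rest Framework", "JavaScript"],
}

# With no stream argument, dump() returns the YAML text as a string.
yaml_text = yaml.dump(user_details, default_flow_style=False)
print(yaml_text)

# Loading the text back should give the original dictionary.
round_trip = yaml.safe_load(yaml_text)
```

Because keys are sorted by default, Password comes first in the printed text even though User was defined first.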
Example -

Output:

    Password: Xavier@123
    Phone: 345464
    Skills:
    - Python
    - SQL
    - Django
    - Rest Framework
    - JavaScript
    User: Zoey

    - name: Zaara
      occupation: Dentist

Explanation -

The dump() method transforms Python objects into the YAML format and writes them to a YAML stream, as we did in the above example. The dump() method takes two arguments - data and stream. The data argument is the Python object that will be transformed into a YAML stream. The optional second argument, stream, must be an open text or binary file. When it is given, the YAML output is written to that file; otherwise, dump() returns the produced document as a string. Let's understand an example of writing Python data to a file.

Example - 2:

Output:

NewDetails.yaml

    - User: Zoey
      Password: Xavier@123
      Phone: 345464
      Skills:
      - Python
      - SQL
      - Django
      - Rest Framework
      - JavaScript
    - name: Zaara
      occupation: Dentist

Explanation -

In the above example, we first defined the Python data to be written to the file. Then, we opened the NewDetails.yaml file in write mode. We called the dump() method and passed it the Python object along with two other keyword arguments. These arguments are -
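Writing to a file can be sketched as below. This is an illustration rather than the tutorial's original code: the two keyword arguments used here, sort_keys=False and default_flow_style=False, are assumptions chosen because they reproduce the unsorted block-style layout shown in the output, and a temporary directory is used so that no file is left behind.

```python
import os
import tempfile

import yaml

# Illustrative records; shortened from the values in the tutorial's output.
records = [
    {"User": "Zoey", "Password": "Xavier@123", "Phone": 345464,
     "Skills": ["Python", "SQL"]},
    {"name": "Zaara", "occupation": "Dentist"},
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "NewDetails.yaml")
    with open(path, "w", encoding="utf-8") as f:
        # Assumed arguments: keep insertion order and use block style.
        # When a stream is given, dump() writes to it and returns None.
        result = yaml.dump(records, f, sort_keys=False, default_flow_style=False)
    with open(path, encoding="utf-8") as f:
        written = f.read()

print(written)
```

The sort_keys argument requires PyYAML 5.1 or later.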
Dump Multiple YAML Documents

The yaml.dump_all() method is used to dump multiple YAML documents to a single stream. It takes a list or generator that produces the Python objects to be serialized into YAML documents, and an optional second argument that is an open file. Let's understand the following example.

Example -

Output:

Using dump() method -

    Password: Xavier@123
    Phone: 345464
    Skills:
    - Python
    - SQL
    - Django
    - Rest Framework
    - JavaScript
    User: Zoey

    - name: Zaara
      occupation: Dentist

Using dump_all() method

    Password: Xavier@123
    Phone: 345464
    Skills:
    - Python
    - SQL
    - Django
    - Rest Framework
    - JavaScript
    User: Zoey
    ---
    name: Zaara
    occupation: Dentist

Python YAML sorting keys

sort_keys is an optional argument used while dumping Python data into a file. If it is set to True (which is also the default), all keys of the YAML documents are sorted alphabetically. Let's understand the following example.

Example -

    import yaml
    from yaml.loader import FullLoader

    # open the YAML file in read mode
    with open('sample.yaml', 'r') as f:
        print("Before Sorting......")
        yaml_data = yaml.load(f, Loader=FullLoader)
        print(yaml_data)

        print("After Sorting......")
        sorted_data = yaml.dump(yaml_data, sort_keys=True)
        print(sorted_data)

Format YAML File

The PyYAML module lets us control the formatting of a YAML document while writing it. The dump() method supports various formatting arguments. Below are the formatting arguments.

Parameter -
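A few of these options can be sketched together with dump_all(). The example below is only an illustration: it uses sort_keys, default_flow_style, and explicit_start, all of which are keyword arguments accepted by dump() and dump_all() (others include indent, width, and canonical). The data values are assumptions.

```python
import yaml

# Two illustrative documents to be written into a single stream.
documents = [
    {"User": "Zoey", "Skills": ["Python", "SQL"], "Phone": 345464},
    {"name": "Zaara", "occupation": "Dentist"},
]

formatted = yaml.dump_all(
    documents,
    sort_keys=True,            # alphabetical key order within each document
    default_flow_style=False,  # block style instead of inline {...} / [...]
    explicit_start=True,       # emit '---' before every document
)
print(formatted)

# safe_load_all() yields one Python object per document in the stream.
reloaded = list(yaml.safe_load_all(formatted))
```

With explicit_start=True even the first document is preceded by `---`, which makes the stream easier to split or append to later.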
Let's understand the following example -

Example -

Output:

    Password: fire123 *
    Skills: -Python -SQL -Django -Rest Framework -JavaScript
    UserName: Antonio
    phone: 9879098

Custom Python Class YAML Serializable

We can create a custom Python class so that YAML is converted into an instance of that class instead of a list or other built-in type. Let's understand the following example -

Example -

Output:

    Jessa queue@123

Custom Tags with PyYAML

We can create custom tags according to the application's requirements and assign default values to them while parsing the YAML file. Doing so involves the steps given below.
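One way such a custom tag could be wired up is sketched below. Everything named here is hypothetical and chosen for illustration: the `!User` tag, the `User` class, the `UserLoader` subclass, and the default phone value. The PyYAML calls themselves (add_constructor, construct_mapping, load) are part of the library.

```python
import yaml


class User:
    """Hypothetical class that a '!User' tag is converted into."""

    def __init__(self, name, password, phone=0):
        # phone falls back to a default when the YAML node omits it
        self.name = name
        self.password = password
        self.phone = phone

    def __repr__(self):
        return f"User(name={self.name}, password={self.password}, phone={self.phone})"


def user_constructor(loader, node):
    # Turn the tagged mapping node into keyword arguments for User.
    fields = loader.construct_mapping(node, deep=True)
    return User(**fields)


class UserLoader(yaml.SafeLoader):
    """Loader subclass, so the tag is not registered on SafeLoader itself."""


UserLoader.add_constructor("!User", user_constructor)

document = """
- !User
  name: Sam
  password: new@123
  phone: 1100
- !User
  name: Gaby
  password: admin@123
"""

users = yaml.load(document, Loader=UserLoader)
print(users)
```

Registering the constructor on a subclass keeps the behaviour local; other code that uses yaml.safe_load() is unaffected.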
Let's understand the following example.

Example -

Output:

    [Custom Tags(user=Sam, password=new@123,phone=1100), Test(name=Gaby, password= admin@123, phone=5656)]

Conversion Table in PyYAML Module

Below is the table that the PyYAML module uses to convert Python objects into their YAML equivalents. The dump() method uses this translation while encoding.
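The translation can be observed directly by dumping a few built-in Python types and reading the result. The following is a small sketch with made-up keys and values; it covers only some common types, not the whole table.

```python
import datetime

import yaml

# One value per common Python type; keys and values are illustrative.
python_values = {
    "text": "hello",                    # str   -> YAML string
    "integer": 7,                       # int   -> YAML integer
    "real": 2.5,                        # float -> YAML float
    "flag": True,                       # bool  -> true / false
    "nothing": None,                    # None  -> null
    "items": [1, 2],                    # list  -> block sequence
    "mapping": {"key": "value"},        # dict  -> nested mapping
    "day": datetime.date(2024, 1, 31),  # date  -> YAML timestamp
}

encoded = yaml.safe_dump(python_values, sort_keys=False)
print(encoded)

# safe_load() applies the reverse translation.
decoded = yaml.safe_load(encoded)
```

Note the lowercase `true` and `null` in the printed YAML, which load back as Python's True and None.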
YAML Errors

The YAML parser raises an exception called YAMLError when it encounters an error. We can use this exception to debug the problem, so it is recommended to wrap YAML parsing and serialization code in a try-except block. Let's understand the following example.

Example -

Tokens

Tokens are generally used in low-level applications such as syntax highlighting. We can use the scan() method to produce a sequence of tokens. Let's understand the following example.

Example -

Output:

    StreamStartToken(encoding=None)
    DocumentStartToken()
    BlockMappingStartToken()
    KeyToken()
    ScalarToken(plain=True, style=None, value='UserName')
    ValueToken()
    ScalarToken(plain=True, style=None, value='Antonio')
    KeyToken()
    ScalarToken(plain=True, style=None, value='Password')
    ValueToken()
    ScalarToken(plain=True, style=None, value='fire123 *')
    KeyToken()
    ScalarToken(plain=True, style=None, value='phone')
    ValueToken()
    ScalarToken(plain=True, style=None, value='9879098')
    KeyToken()
    ScalarToken(plain=True, style=None, value='Skills')
    ValueToken()
    ScalarToken(plain=True, style=None, value='-Python -SQL -Django -Rest Framework -JavaScript')
    BlockEndToken()
    DocumentEndToken()
    StreamEndToken()

Python YAML to XML

YAML data can be converted into XML format using the xmlplain module. XML stands for eXtensible Markup Language, a markup language in which documents define their own tags rather than using a fixed set such as HTML's. The obj_from_yaml() method is used to generate the XML plain object from a YAML stream or string. To keep the elements of the XML plain object in order, YAML mappings are stored as OrderedDict objects. Let's take a sample YAML file with employee details and convert it into an XML file.

Example -

Let's understand the code implementation.

Example -

Conclusion

In this tutorial, we have learned some important concepts of YAML and the PyYAML module. We covered how to create custom tags and how to load the contents of a YAML file into our Python program as dictionaries. We have also discussed how to manipulate YAML-formatted files.
This tutorial covered only a brief, basic part of the library's functionality.
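As a closing illustration, the try-except advice from the YAML Errors section can be sketched as follows. The helper name `parse_yaml` and the message format are hypothetical; yaml.YAMLError and the problem_mark attribute on marked errors come from PyYAML.

```python
import yaml


def parse_yaml(text):
    """Return (data, None) on success or (None, message) on failure."""
    try:
        return yaml.safe_load(text), None
    except yaml.YAMLError as exc:
        # Syntax errors carry a problem_mark with 0-based line and column.
        mark = getattr(exc, "problem_mark", None)
        if mark is not None:
            message = (
                f"YAML error at line {mark.line + 1}, "
                f"column {mark.column + 1}: {exc.problem}"
            )
            return None, message
        return None, f"YAML error: {exc}"


good, good_error = parse_yaml("UserName: Antonio")
bad, bad_error = parse_yaml("UserName: [Antonio")  # unclosed flow sequence

print(good)
print(bad_error)
```

Catching the base YAMLError class covers scanner, parser, and constructor errors alike, so a single except clause is usually enough.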