Pandas Concatenation

Pandas can combine Series and DataFrame objects using various kinds of set logic for the indexes and relational algebra functionality. Older versions also supported Panel objects, but Panel has since been removed from pandas.

The concat() function concatenates pandas objects along a given axis.

Syntax:
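
The syntax listing is missing here. As a rough sketch, the call below spells out the commonly used keyword arguments of concat() at their documented defaults; the names s1 and s2 are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical inputs, used only to make the call runnable.
s1 = pd.Series(['p', 'q'])
s2 = pd.Series(['r', 's'])

# Every keyword below is set to its default value, so this is
# equivalent to pd.concat([s1, s2]).
result = pd.concat(
    [s1, s2],
    axis=0,
    join='outer',
    ignore_index=False,
    keys=None,
    levels=None,
    names=None,
    verify_integrity=False,
    sort=False,
)
print(result)
```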

Parameters:

  • objs: A sequence or mapping of Series or DataFrame objects.
    If a dict is passed, its keys are used as the keys argument (older pandas versions documented these as sorted), unless keys is passed explicitly, in which case the values are selected. Any None objects are dropped silently unless they are all None, in which case a ValueError is raised.
  • axis: The axis to concatenate along: 0 (index, the default) or 1 (columns).
  • join: How to handle the indexes on the other axis: 'outer' (the default) takes their union, 'inner' takes their intersection.
  • join_axes: A list of Index objects to use for the other (n-1) axes instead of performing inner or outer set logic. This parameter was deprecated in pandas 0.25 and removed in pandas 1.0; reindex the result instead.
  • ignore_index: bool, default False
    If True, the index values on the concatenation axis are not used. The resulting axis is labeled 0, ..., n - 1.
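
To illustrate how axis and join interact, here is a minimal sketch that concatenates two small DataFrames side by side. The frames left and right and their contents are made up for this illustration and are not part of the original tutorial.

```python
import pandas as pd

# Hypothetical frames whose row labels only partly overlap.
left = pd.DataFrame({'A': [1, 2, 3]}, index=[0, 1, 2])
right = pd.DataFrame({'B': [10, 20, 30]}, index=[1, 2, 3])

# axis=1 places the frames side by side; join decides which row labels survive.
outer = pd.concat([left, right], axis=1)                # union of row labels
inner = pd.concat([left, right], axis=1, join='inner')  # intersection of row labels

print(outer)
print(inner)
```

With join='outer', rows that exist in only one frame are kept and the missing cells are filled with NaN; with join='inner', only rows present in both frames remain.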

Returns

A Series is returned when all of the objects are Series and they are concatenated along the index (axis=0). If objs contains at least one DataFrame, or if Series are concatenated along the columns (axis=1), a DataFrame is returned.
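
The return-type rule can be checked with a short sketch. The objects s1, s2, and df are hypothetical and exist only for this illustration.

```python
import pandas as pd

s1 = pd.Series(['p', 'q'])
s2 = pd.Series(['r', 's'])
df = pd.DataFrame({'letters': ['t', 'u']})

stacked = pd.concat([s1, s2])               # all Series, along the index
side_by_side = pd.concat([s1, s2], axis=1)  # all Series, along the columns
mixed = pd.concat([df, s1])                 # at least one DataFrame

# Print the class name of each result.
for result in (stacked, side_by_side, mixed):
    print(type(result).__name__)
```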

Example 1:
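
The code listing for this example is missing. A minimal sketch that would produce the output shown below, assuming two Series named a_data and b_data (names taken from the later examples):

```python
import pandas as pd

a_data = pd.Series(['p', 'q'])
b_data = pd.Series(['r', 's'])

# Each Series keeps its own index labels, so 0 and 1 appear twice.
result = pd.concat([a_data, b_data])
print(result)
```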

Output

0       p
1       q
0       r
1       s
dtype: object

Example 2: In the above example, the index labels of the inputs are repeated in the result. We can replace them with a fresh index by using the ignore_index parameter. The code below demonstrates how ignore_index works.
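
The listing is missing here as well. A sketch under the same assumptions as before (two Series named a_data and b_data):

```python
import pandas as pd

a_data = pd.Series(['p', 'q'])
b_data = pd.Series(['r', 's'])

# ignore_index=True discards the original labels and renumbers from 0.
result = pd.concat([a_data, b_data], ignore_index=True)
print(result)
```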

Output

0       p
1       q
2       r
3       s
dtype: object

Example 3: We can add a hierarchical index at the outermost level of the data by using the keys parameter.
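
The listing is missing; a sketch that would give the output below, assuming the same two Series and using the key labels that appear in that output:

```python
import pandas as pd

a_data = pd.Series(['p', 'q'])
b_data = pd.Series(['r', 's'])

# Each key becomes the outer index level for the matching input object.
result = pd.concat([a_data, b_data], keys=['a_data', 'b_data'])
print(result)
```

Individual pieces can then be selected by key, for example result.loc['a_data'].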

Output

a_data    0      p
          1      q
b_data    0      r
          1      s
dtype: object

Example 4: We can label the levels of the hierarchical index by using the names parameter. The code below shows how names works.
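
The listing is missing; a sketch under the same assumptions, passing names together with keys and using the level names shown in the output below:

```python
import pandas as pd

a_data = pd.Series(['p', 'q'])
b_data = pd.Series(['r', 's'])

# names labels the index levels created by keys (outer) and the
# original index (inner).
result = pd.concat(
    [a_data, b_data],
    keys=['a_data', 'b_data'],
    names=['Series name', 'Row ID'],
)
print(result)
```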

Output

Series name   Row ID
a_data         0    p
               1    q
b_data         0    r
               1    s
dtype: object

Concatenation using append

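The append method was a shortcut for concatenating Series and DataFrame objects along axis=0. It was deprecated in pandas 1.4 and removed in pandas 2.0, so current code should call concat() instead.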

Example:
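
The listing for this example is missing. The sketch below rebuilds the two input frames from the output that follows; the variable names one and two are hypothetical. It uses pd.concat so that it runs on current pandas; on versions before 2.0, one.append(two) produced the same result.

```python
import pandas as pd

one = pd.DataFrame(
    {
        'Name': ['Parker', 'Smith', 'Allen', 'John', 'Parker'],
        'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
        'Marks_scored': [98, 90, 87, 69, 78],
    },
    index=[1, 2, 3, 4, 5],
)
two = pd.DataFrame(
    {
        'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
        'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
        'Marks_scored': [89, 80, 79, 97, 88],
    },
    index=[1, 2, 3, 4, 5],
)

# Stack the rows of `two` below the rows of `one`, keeping both indexes.
result = pd.concat([one, two])
print(result)
```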

Output

     Name      subject_id     Marks_scored
1    Parker     sub1           98
2    Smith      sub2           90
3    Allen      sub4           87
4    John       sub6           69
5    Parker     sub5           78
1    Billy      sub2           89
2    Brian      sub4           80
3    Bran       sub3           79
4    Bryce      sub6           97
5    Betty      sub5           88




