How to Validate Image File Extension using Regular Expression in C++

In this article, we will discuss how to validate image file extensions using regular expressions in C++, with several examples.

Introduction:

Validating image files is an important task in many applications, especially those that handle user uploads or external data sources. Checking the file extension ensures that only allowed image types are accepted, which reduces the risk of security vulnerabilities and helps preserve data integrity. Regular expressions provide a concise way to define patterns for validating image file extensions programmatically.

Problem Statement:

When a C++ program accepts an image for processing or storage, it commonly needs to check whether the file's extension is valid. Comparing the extension against each allowed format by hand becomes time-consuming and error-prone as the list of allowed formats grows. Additionally, if a naming convention allows several spellings of the same extension, for example jpg or jpeg, a regular expression is a more efficient way to define rules for all of these variants and to validate file names against them.

What is a Regular Expression?

A regular expression is a pattern that describes a set of strings and is used to match character combinations within text. In C++, the <regex> header of the standard library (available since C++11) provides classes and functions for working with regular expressions, such as std::regex and std::regex_match. The following steps describe how to use regular expressions in C++ to check image file extensions:

  • Define valid image file extensions: Decide which image types the application supports (.jpg, .png, .gif, .bmp, ...).
  • Create a regular expression pattern: Write a pattern that matches any of the valid image file extensions. Use the '|' operator to list alternative extensions, and escape special characters such as '.'.
  • Compile the regular expression: Construct a std::regex object from the pattern, using the <regex> header.
  • Validate the file extension: Given a file name, check whether it matches the compiled pattern using std::regex_match. Because std::regex_match succeeds only when the entire string matches, the pattern must also cover the base name of the file.
  • Handle the validation result: Accept or reject the file according to the result.

Program 1:

Let us take an example to demonstrate how to validate an image file extension using a regular expression in C++.

Output:

Valid image file extension.

Explanation:

  1. In this example, we define a function validateImageFileExtension that takes a fileName as input and returns a boolean indicating whether the file extension is valid.
  2. Inside the function, we construct a regular expression pattern, validExtensions, that matches common image file extensions.
  3. The pattern is compiled into a std::regex object.
  4. Inside the main function, the validation function is tested on the example file name "example.jpg".
  5. Based on the validation outcome, the program displays a message stating whether the file extension is valid or invalid. For "example.jpg", it prints "Valid image file extension."

Program 2:

Let us take another example to demonstrate how to validate an image file extension using a regular expression in C++.

Output:

Test Case 1: 1
Test Case 2: 1
Test Case 3: 0
Test Case 4: 0
Test Case 5: 1

Explanation:

1. Header Inclusions:

The program includes the following header files:

  • <iostream> for input and output operations.
  • <regex> for regular expression support.

2. Namespace Declaration:

  • The using namespace std; statement allows the use of standard C++ library functions and objects without prefixing them with std::.

3. Function Definition (validateImageExtension):

  • This function takes a string (fileName) as input and returns a boolean value indicating whether the file extension is valid or not.
  • It uses a regular expression pattern to define valid image file extensions like jpg, jpeg, png, gif, JPG, JPEG, PNG, and GIF.
  • The function checks if the provided file name is empty. If it is, it returns false.
  • After that, it uses regex_match to match the file name against the regular expression pattern. If the match is successful, it returns true; otherwise, it returns false.

4. Main Function:

  • The main function is the entry point of the program.
  • It contains several test cases to demonstrate the usage of the validateImageExtension function.
  • Each test case involves providing a file name to the validateImageExtension function and printing the result (true or false) to the console.

5. Test Cases:

Test cases include various file names with different extensions:

  • Test Case 1: A valid image file name with the extension .png (result 1).
  • Test Case 2: A valid image file name with the extension .jpg (result 1).
  • Test Case 3: A file name with the extension .gif, which the program rejects (result 0).
  • Test Case 4: An invalid image file name with the extension .mp3 (result 0).
  • Test Case 5: A file name with the extension .jpg, preceded by a space, which the program accepts (result 1).

6. Output:

  • The program writes the result of each test case to the console. It prints 1 if the file name is valid and 0 otherwise.

Uses:

Validation, including image file extension validation, has several uses. Some of the main ones are as follows:

  1. Data integrity: Validation checks that data entered into or received by a system conforms to defined standards or rules, and helps ensure that data is not modified or deleted without authorisation.
  2. Security: Input validation reduces security threats such as injection attacks (for example, SQL injection or cross-site scripting (XSS)) by rejecting or sanitising input that contains malicious code or unexpected characters.
  3. User Experience: Validation gives users real-time feedback on input errors. For example, validating a web form on the client side reports problems immediately, which reduces frustration and lowers the rate of user errors.
  4. Compliance: Standards and regulations in some industries require data validation rules. For example, financial institutions validate customer information to comply with Anti-Money Laundering (AML) requirements.
  5. Data accuracy: Validating data against predefined criteria or business rules helps ensure that it is accurate enough for automation and analysis, so organisations can be confident in their data-driven decisions.
  6. Error Prevention: Validation reduces errors in user-entered data by enforcing input rules and constraints. For example, email addresses are often validated on entry to ensure that they are correctly formatted, which reduces the likelihood that such addresses will be excluded from a mailing list.
  7. System reliability: Validating input data before processing it helps prevent crashes or hangs caused by unexpected or erroneous data. Catching errors early in a process makes the system's output more reliable.
  8. Quality assurance: Quality assurance requires that software runs as intended under a range of possible conditions and inputs. Automated tests can verify that data and functionality behave correctly, and validation catches bugs and other issues before end users encounter them.

Conclusion:

In conclusion, we can use a regular expression to validate the file extensions of images in C++. This reduces the complexity of file extension checks, because the programmer maintains a single pattern rather than a separate comparison for each extension, and it makes the application more robust. Regular expressions are a flexible and effective way to define patterns and to search text for matches, so the pattern is the key concept to get right.