Symfony Internals #2: Data Validation

Is Symfony using some sort of magic to validate inputs?

Alexandre Daubois
The SensioLabs Tech Blog


Photo by SIMON LEE on Unsplash

This article is the second part of the Symfony Internals blog series, which dissects the internal workings of the framework and its components. Whether you want to contribute to Symfony or simply learn for yourself, you’ll find here an accessible explanation of the solutions implemented within Symfony’s internal classes.

One of the first things you learn when getting into application development is to never, absolutely never, trust user input. It means that you must always check and validate the format of the data coming to you and never assume it is valid. Said differently: always assume the data you receive is invalid until you have verified it. Don’t get me wrong, this isn’t only about the users of your applications: Symfony itself always validates the data you pass to the framework (in your configuration files and in service declarations, for example). I wrote a bit about it in the first article of Symfony Internals.

Wherever you need to do this, it is a tedious task. It’s roughly always the same process and the same validation rules. Luckily for us, the Validator component of Symfony (which you can use in your own PHP project, as it is a standalone library!) helps you write clear constraints and fits most use cases. If it doesn’t, you can write your own validation rule and use it with the Validator component.

On a side note, the Validator component is based on the JSR 303 (Bean Validation) specification.

The kind-of-entrypoint of this component may be the ValidatorInterface. It declares a validate method, which enables you to validate a value against a set of constraints. It returns an instance of an implementation of ConstraintViolationListInterface, which is likely to be ConstraintViolationList.
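To make this concrete, here is a minimal sketch of a validate call. It assumes PHP 8.1+ and symfony/validator installed through Composer (the autoload path is an assumption about your project layout); the value and the constraints are arbitrary examples.

```php
<?php
// Assumption: symfony/validator was installed with Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraints\Length;
use Symfony\Component\Validator\Constraints\NotBlank;
use Symfony\Component\Validator\Validation;

// Build a standalone validator (an implementation of ValidatorInterface).
$validator = Validation::createValidator();

// Validate a plain value against a set of constraints.
$violations = $validator->validate('Bob', [
    new NotBlank(),      // passes: the value is not empty
    new Length(min: 10), // fails: the value has only 3 characters
]);

// The result is a ConstraintViolationListInterface implementation.
echo count($violations), " violation(s)\n";
```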

ConstraintViolationList implements several interfaces, including the standard Traversable, Countable and ArrayAccess. Thanks to this, you can use objects of this type like an array: loop through them, count their elements, access items with the array syntax, etc. If you’re into iterators, this class also implements IteratorAggregate and relies on an ArrayIterator. That’s pretty good news if you have gigantic sets of data to validate!

This list holds violations as instances of ConstraintViolation (one per violation). There’s no complex logic in there: it is a kind of model that holds all the information about the violation, such as the path (i.e. the property that caused the violation), the violation message, the message template and its parameters, etc.
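The following sketch shows how the list and its violations can be inspected. It assumes the same setup as any standalone use of the component (PHP 8.1+, symfony/validator installed through Composer); the empty string is just a value chosen to trigger NotBlank.

```php
<?php
// Assumption: symfony/validator was installed with Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraints\NotBlank;
use Symfony\Component\Validator\Validation;

$validator = Validation::createValidator();
$violations = $validator->validate('', [new NotBlank()]);

// Countable and Traversable: count it and loop over it like an array.
echo count($violations), " violation(s)\n";
foreach ($violations as $violation) {
    echo $violation->getMessage(), "\n";
}

// ArrayAccess: fetch a single ConstraintViolation by index.
$first = $violations[0];

// A violation is a plain data holder exposing getters.
var_export([
    'path'       => $first->getPropertyPath(),    // empty for a plain value
    'template'   => $first->getMessageTemplate(),
    'parameters' => $first->getParameters(),
    'invalid'    => $first->getInvalidValue(),
    'code'       => $first->getCode(),
]);
```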

Alright, so we have a Validator returning a list of violations. We have the input, we have the output. So, what’s happening in between?

When validating input with the Validator, you’re actually using a RecursiveValidator. And that’s super cool because thanks to this, you’ll be able to validate deep data structures like objects in just one call.

At the validation call, the RecursiveValidator starts an execution context. What’s the point here? Starting a new context at each execution lets you reuse the same Validator instance while keeping the violations of one validation separate from those of another. Indeed, it would be disturbing to have violations piling up indefinitely in the same list at each validation. It also wouldn’t be very cool if the list of violations were wiped each time a new validation is done. So each Validator execution is isolated. You may now understand why it is named an ExecutionContext.
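A small sketch of that isolation, assuming PHP 8.1+ and symfony/validator installed through Composer: the same validator instance is used twice, and each call hands back its own list.

```php
<?php
// Assumption: symfony/validator was installed with Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraints\NotBlank;
use Symfony\Component\Validator\Validation;

$validator = Validation::createValidator();

// First execution: the empty string violates NotBlank.
$first = $validator->validate('', new NotBlank());

// Second execution with the same validator: a valid value this time.
$second = $validator->validate('hello', new NotBlank());

// Each call ran in its own execution context, so the lists are independent:
// the second call neither appended to nor cleared the first list.
echo count($first), ' / ', count($second), "\n";
```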

Once the execution context is started and an internal RecursiveContextualValidator is created, we can call its validate method. Remember the validate method? I said you’re able to pass it a set of constraints (or only one, but a set begins with at least 1 element, right?). Two possibilities here:

  • You passed a constraint to validate: your data will be validated against it;
  • You didn’t pass any constraint. In this case, the RecursiveContextualValidator will fetch the constraints defined for this object’s class and try to validate it against the constraints. Note that if the data you passed is an array, it will try to validate all elements of this array.

In the first case, that’s great, we already know which constraints to use. In the second case, we must find these constraints thanks to metadata and constraints mapping. This mapping can be done with annotations or PHP 8 attributes, but also in YAML, plain PHP or XML files.
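The second case can be sketched with the plain PHP mapping, which relies on a static method of the validated class. This assumes PHP 8.1+ and symfony/validator installed through Composer; the Author class is hypothetical and exists only for this illustration.

```php
<?php
// Assumption: symfony/validator was installed with Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraints\NotBlank;
use Symfony\Component\Validator\Mapping\ClassMetadata;
use Symfony\Component\Validator\Validation;

// Hypothetical class, used only for this illustration.
final class Author
{
    public function __construct(public string $name = '')
    {
    }

    // Plain PHP mapping: called by the validator to collect the constraints.
    public static function loadValidatorMetadata(ClassMetadata $metadata): void
    {
        $metadata->addPropertyConstraint('name', new NotBlank());
    }
}

// Tell the validator which static method holds the mapping.
$validator = Validation::createValidatorBuilder()
    ->addMethodMapping('loadValidatorMetadata')
    ->getValidator();

// No constraint is passed: they are looked up from the class metadata.
$violations = $validator->validate(new Author());

foreach ($violations as $violation) {
    echo $violation->getPropertyPath(), ': ', $violation->getMessage(), "\n";
}
```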

Our contextual validator uses the so-called MetadataFactory to fetch this information and return it as instances of ClassMetadata. Internally, the MetadataFactory uses an implementation of LoaderInterface, whose job is to actually load the validation metadata of a class. That’s where the multiple mapping formats listed above are concretely implemented and supported. There is one loader for each format. The available loaders are:

  • DoctrineLoader (part of the Doctrine Bridge, not of the Validator component);
  • YamlFilesLoader;
  • XmlFilesLoader;
  • PropertyInfoLoader;
  • AnnotationLoader;
  • StaticMethodLoader.

Additionally, you can find a LoaderChain, which allows you to use multiple loaders “at once” to load metadata.
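Here is a rough sketch of the factory and loaders used directly, outside of any validator. It assumes PHP 8.1+ and symfony/validator installed through Composer, and it uses LazyLoadingMetadataFactory as the concrete metadata factory; the Book class is hypothetical.

```php
<?php
// Assumption: symfony/validator was installed with Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraints\NotBlank;
use Symfony\Component\Validator\Mapping\ClassMetadata;
use Symfony\Component\Validator\Mapping\Factory\LazyLoadingMetadataFactory;
use Symfony\Component\Validator\Mapping\Loader\LoaderChain;
use Symfony\Component\Validator\Mapping\Loader\StaticMethodLoader;

// Hypothetical class, used only for this illustration.
final class Book
{
    public string $title = '';

    public static function loadValidatorMetadata(ClassMetadata $metadata): void
    {
        $metadata->addPropertyConstraint('title', new NotBlank());
    }
}

// A chain with a single loader; more loaders could be appended to the array.
$loader = new LoaderChain([
    new StaticMethodLoader('loadValidatorMetadata'),
]);

// The factory asks the loader to populate a ClassMetadata instance.
$factory = new LazyLoadingMetadataFactory($loader);
$metadata = $factory->getMetadataFor(Book::class);

// List each constrained property with the constraints attached to it.
foreach ($metadata->getConstrainedProperties() as $property) {
    foreach ($metadata->getPropertyMetadata($property) as $propertyMetadata) {
        foreach ($propertyMetadata->getConstraints() as $constraint) {
            echo $property, ' => ', $constraint::class, "\n";
        }
    }
}
```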

Once our RecursiveContextualValidator has all the constraints that need to be validated, it loops through them and collects every violation that occurs.

We have a pretty complete path here! But there’s one last thing missing: how are constraints described, and how is a value actually validated?

Every validation constraint consists of two parts: the Constraint and the ConstraintValidator. The first one is the model that defines the name of the constraint, its error message, its available options, its error code and so on. The ConstraintValidator implements a validate method which receives the value to validate. Finally! What does it look like? Let’s take the example of the Blank constraint, which is short:
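As a rough sketch, a constraint class of this kind can be written as below. It is modeled on the built-in Blank constraint but is not a copy of the library source: the BlankSketch name, the placeholder error code and the simplified constructor are assumptions made for the illustration, and it targets PHP 8.1+ with symfony/validator 6.1+ (where error names live in an ERROR_NAMES constant).

```php
<?php
// Assumption: symfony/validator 6.1+ was installed with Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraint;

/**
 * Hypothetical constraint modeled on the built-in Blank constraint.
 *
 * @Annotation
 * @Target({"PROPERTY", "METHOD", "ANNOTATION"})
 */
#[\Attribute(\Attribute::TARGET_PROPERTY | \Attribute::TARGET_METHOD | \Attribute::IS_REPEATABLE)]
class BlankSketch extends Constraint
{
    // Placeholder error code; built-in constraints use a fixed UUID here.
    public const NOT_BLANK_ERROR = 'blank-sketch-not-blank';

    // Maps error codes to human-readable names.
    protected const ERROR_NAMES = [
        self::NOT_BLANK_ERROR => 'NOT_BLANK_ERROR',
    ];

    // Default error message, overridable per usage.
    public string $message = 'This value should be blank.';

    // Simplified constructor: only named arguments, no options array.
    public function __construct(?string $message = null, ?array $groups = null, mixed $payload = null)
    {
        parent::__construct(null, $groups, $payload);

        $this->message = $message ?? $this->message;
    }
}

// The constraint is only a model: it carries configuration, not logic.
$constraint = new BlankSketch(message: 'Must be empty.');
echo $constraint->message, "\n";
```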

And here is its corresponding validator:
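The validator side can be sketched in the same spirit. The class below is modeled on the built-in BlankValidator rather than copied from it; BlankValidatorSketch and BlankWithSketch are hypothetical names, the latter existing only to route the real Blank constraint to the sketch so it can be run. It assumes PHP 8.1+ and symfony/validator installed through Composer.

```php
<?php
// Assumption: symfony/validator was installed with Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraint;
use Symfony\Component\Validator\Constraints\Blank;
use Symfony\Component\Validator\ConstraintValidator;
use Symfony\Component\Validator\Exception\UnexpectedTypeException;
use Symfony\Component\Validator\Validation;

// Hypothetical validator modeled on the built-in BlankValidator.
class BlankValidatorSketch extends ConstraintValidator
{
    public function validate(mixed $value, Constraint $constraint): void
    {
        // A validator only knows how to handle its own constraint type.
        if (!$constraint instanceof Blank) {
            throw new UnexpectedTypeException($constraint, Blank::class);
        }

        // Anything other than an empty string or null is a violation.
        if ('' !== $value && null !== $value) {
            // The violation is recorded in the current execution context.
            $this->context->buildViolation($constraint->message)
                ->setParameter('{{ value }}', $this->formatValue($value))
                ->setCode(Blank::NOT_BLANK_ERROR)
                ->addViolation();
        }
    }
}

// Hypothetical subclass that points the Blank constraint at the sketch above.
class BlankWithSketch extends Blank
{
    public function validatedBy(): string
    {
        return BlankValidatorSketch::class;
    }
}

$violations = Validation::createValidator()->validate('not empty', new BlankWithSketch());

foreach ($violations as $violation) {
    echo $violation->getMessage(), "\n";
}
```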

Note, in the annotation and attribute declarations, that the constraint can be applied to properties as well as methods, and that it is repeatable. Also note the use of the execution context we spoke about earlier! If you go to the source code of Symfony’s Validator component, you’ll find out that all constraints are defined this way.

Our RecursiveContextualValidator receives a set of Constraints. It then uses a ConstraintValidatorFactory to find the corresponding validator and execute it. The execution context is filled with violations if needed. Then everything bubbles up to the developer, who can process the list of violations. And… That’s it!
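The lookup done by the factory can be observed directly. This is a minimal sketch assuming PHP 8.1+ and symfony/validator installed through Composer; it uses the default ConstraintValidatorFactory, whereas a full-stack application would typically rely on a container-aware one.

```php
<?php
// Assumption: symfony/validator was installed with Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraints\Blank;
use Symfony\Component\Validator\Constraints\BlankValidator;
use Symfony\Component\Validator\ConstraintValidatorFactory;

$constraint = new Blank();

// By default, a constraint names its validator: its own class name + "Validator".
echo $constraint->validatedBy(), "\n";

// The factory turns that name into a validator instance.
$factory = new ConstraintValidatorFactory();
$validator = $factory->getInstance($constraint);

echo $validator::class, "\n";
```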

Big chunk right there, huh? I purposely left out some concepts, like validation groups and the cache, but this won’t prevent you from understanding the internal workings of the component.

It goes without saying that the next time you implement your own validation constraint, many things will seem much more obvious!
