Hi, I am Sergei Golotsynski. I'm a software engineer at Johns Hopkins University and a Galaxy core contributor. In this talk, I will give a brief overview of the current work on modernizing Galaxy's data model. Galaxy uses a relational database to persist the objects and object relationships that support its business logic. This includes users, roles, histories, workflows, et cetera, as well as numerous associations between them. The data model represents the object view of the data and serves as the abstraction layer that provides access to it. To map Python objects and the relationships between them onto tables and rows in the database, Galaxy relies on SQLAlchemy, which is a powerful SQL toolkit and object-relational mapper. SQLAlchemy is an essential component, and its role is not limited to providing the option of plugging in different database backends. It is well known that relational databases behave less like object collections the more size and performance start to matter. At the same time, object collections behave less like tables and rows the more abstraction starts to matter. This is usually referred to as the object-relational impedance mismatch. We need the specificity of the relational model, yet we also need clean abstractions. SQLAlchemy bridges this gap by using high-level patterns and automating the repetitive tasks of mapping and persisting data, while at the same time not concealing the underlying database schema and query design, thus exposing relational concepts to the developer. As such, SQLAlchemy is an integral part of Galaxy's backend. So why are we modernizing the model? Here's one reason. Galaxy's data model is complex. Today we have more than 150 model classes and more than 300 explicitly defined relationships between them. Making any major sweeping upgrades is a non-trivial endeavor. As a result, modifications have been mostly incremental.
Over the years, patterns and libraries used by Galaxy's model have become outdated. SQLAlchemy's documentation has moved on and is largely based on more modern, robust approaches. Overall, the data model is becoming hard to maintain and hard to modify. The other reason that triggered the move to modernize Galaxy's data model is SQLAlchemy's transition to its upcoming 2.0 release. This release is a major shift for a variety of SQLAlchemy usage patterns left over from its early development period under Python 2.3, a time when there were no context managers, function decorators, and many other language features that exist in modern Python. According to SQLAlchemy, the focus of this release is a modernized, streamlined usage model that will be significantly more minimalist and consistent, as well as more capable. The transition of Galaxy's code base to adopt this release involves considerable modifications, which is a perfect excuse to revise and improve Galaxy's data model. These improvements so far include resolving the issues required for upgrading to SQLAlchemy 1.4, which is a transitional version; redefining the model using declarative mapping, which has been the recommended approach since 2010; and switching to Alembic, a database migrations tool, which is part of the SQLAlchemy project. Through these improvements, we expect to have a cleaner, more efficient, and more reliable data model. And here's just one example of the improvements to the data model. In the current system, when we add a new model, we define the model class. However, that class definition does not include the attributes mapped to the fields in the database table. Instead, we define those attributes in a Table object in a separate module. Furthermore, we need to explicitly associate the model class with this Table object. We also need to define any relationships the model has with other models, with bidirectional relationships often defined implicitly.
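For readers of the transcript, the three separate steps just described can be sketched roughly as follows. This is a minimal illustration, not Galaxy's actual code: the table names, column names, and the `members` backref are assumptions chosen for the example, and the sketch uses `registry.map_imperatively`, the SQLAlchemy 1.4+ spelling of classical mapping, so that it also runs on current releases.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, String, Table
from sqlalchemy.orm import configure_mappers, registry, relationship

metadata = MetaData()
mapper_registry = registry(metadata=metadata)


# Step 1: the model classes. Note that they say nothing about columns.
class Group:
    def __init__(self, name):
        self.name = name


class UserGroupAssociation:
    def __init__(self, group):
        self.group = group


# Step 2: the tables, which in this style typically live in another module.
# Table and column names here are illustrative assumptions.
group_table = Table(
    "galaxy_group",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(255)),
)

user_group_association_table = Table(
    "user_group_association",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("group_id", Integer, ForeignKey("galaxy_group.id")),
)

# Step 3: explicitly associate each class with its table and declare
# relationships. The backref creates Group.members implicitly, so the
# attribute never appears in the Group class or in its own mapping.
mapper_registry.map_imperatively(Group, group_table)
mapper_registry.map_imperatively(
    UserGroupAssociation,
    user_group_association_table,
    properties={"group": relationship(Group, backref="members")},
)

configure_mappers()  # resolve the mappings, including the implicit backref
```

Reading the `Group` class alone gives no hint that `Group.members` exists, which is the kind of scattered definition the talk is pointing at.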
All this information is spread across several locations, and as a result, the model may be hard to comprehend. Furthermore, this leads to potential bugs that happen when we accidentally overwrite an existing relationship. In this very example, the relationship between UserGroupAssociation and Group is defined twice, once as members and once as users. In contrast, the updated system uses declarative mapping. We define the model, the database-mapped attributes, and all the relationships in one place. We reduce boilerplate, reduce code duplication, and we get a clean, concise model definition that's much easier to visually process. Furthermore, the new version of SQLAlchemy prevents many bugs from occurring. An overwritten relationship will trigger an error or a warning, so the bug demonstrated on the previous slide is unlikely to be introduced. As of now, the model has been updated to SQLAlchemy 1.4. In the course of the upgrade, more than 70 bugs and warnings have been resolved, including three issues reported to SQLAlchemy. There are several steps remaining, and we expect them to be completed this fall. We expect the impact of these changes to be subtle yet significant. A cleaner and more concise data model, with many problematic and confusing behaviors removed, leads to better, cleaner code and fewer bugs. Naturally, this leads to an overall better user experience. Thank you.
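As a written companion to the talk, the declarative style it describes might look like the sketch below. Again, this is an illustration under assumptions rather than Galaxy's real model: the table and column names are hypothetical, and `declarative_base` from `sqlalchemy.orm` is used because it is available in both SQLAlchemy 1.4 and 2.0.

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import configure_mappers, declarative_base, relationship

Base = declarative_base()


class Group(Base):
    # Table name, columns, and relationships all live on the class itself.
    __tablename__ = "galaxy_group"  # illustrative name

    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    # Both sides of the relationship are spelled out explicitly.
    members = relationship("UserGroupAssociation", back_populates="group")


class UserGroupAssociation(Base):
    __tablename__ = "user_group_association"  # illustrative name

    id = Column(Integer, primary_key=True)
    group_id = Column(Integer, ForeignKey("galaxy_group.id"))
    group = relationship("Group", back_populates="members")


configure_mappers()  # resolve the string references between the two classes
```

Using `back_populates` on both classes, rather than an implicit backref, keeps each relationship visible where the model is defined, which is the readability gain the talk emphasizes.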