Attention!

The content on this site is a materials pilot. It represents neither changes to existing policy nor pending new policies. THIS IS NOT OFFICIAL GUIDANCE.

Data: The database


Iterative development

Ask how the state approaches security, performance, and migration testing.

Ask how project leads interact with the testing process.

  • Bad: The team cannot answer what types of testing they are doing, only that they test at the end of the process.
  • Meh: The team can describe their testing approaches but testing is done by a siloed team.
  • Good: The team can demonstrate their testing approaches. Testing and development are done by the same team.

What's this about?

An application involves both data and processes that operate over that data. Without the data, the application is nothing. As a result, how that data is organized, where it is stored, and who controls it all become critical questions in the lifecycle of a long-running software project.

Lesson outline

Databases (20m, solo)

A database organizes data. Databases can contain words, numbers, images… really, any kind of digital data. In the databases that state systems use, there will likely be text (names, phone numbers, etc.) and possibly some images (scans of identification, paperwork). The important question is less what kind of data is in a state’s databases and more how it is organized. Ultimately, our questions will lead us to the question of how movable that data is.

The LEGO database

To get a quick overview of what a database is and how it is used, sit back, channel your inner child, and revisit the world of LEGO. Danielle Thé provides a concise overview of databases and SQL (pronounced “see-quell”), the language of databases. She does it with LEGO in 4 minutes.

Although short, it may be worth taking notes on terminology and the metaphors used here. If you’re working with states and vendors and they can’t break down how their data is stored and organized in a simple, clear way, there are problems.

Data, in CMS terms

Now that we have visions of LEGO people dancing in our heads, let’s imagine we’re building a case management system for SNAP. Essentially, it’s an interface on top of a database system. We’re going to need to keep track of a lot of information:

  • People’s names
  • SSNs
  • Address(es)
    • Current
    • Previous
  • Dependents
    • Names
    • SSNs
    • Ages
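To make the list above concrete, here is a minimal sketch of how that information might be laid out as database tables. It uses SQLite through Python's standard library only because it runs anywhere; the table names, column names, and sample rows are all made up for illustration and are not taken from any real SNAP system.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption
# made for illustration, not the layout of any real case management system.
SCHEMA = """
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    ssn  TEXT
);
CREATE TABLE address (
    id         INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL REFERENCES person(id),
    street     TEXT NOT NULL,
    is_current INTEGER NOT NULL  -- 1 = current, 0 = previous
);
CREATE TABLE dependent (
    id        INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id),
    name      TEXT NOT NULL,
    ssn       TEXT,
    age       INTEGER
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO person (id, name, ssn) VALUES (1, 'Example Person', '000-00-0000')"
    )
    conn.executemany(
        "INSERT INTO address (person_id, street, is_current) VALUES (?, ?, ?)",
        [(1, "12 Old Road", 0), (1, "34 New Street", 1)],
    )
    conn.execute(
        "INSERT INTO dependent (person_id, name, ssn, age) "
        "VALUES (1, 'Example Child', '000-00-0001', 7)"
    )
    conn.commit()
    return conn


def addresses_for(conn, person_id):
    """Return (street, is_current) pairs for one person, current address first."""
    return conn.execute(
        "SELECT street, is_current FROM address "
        "WHERE person_id = ? ORDER BY is_current DESC, id",
        (person_id,),
    ).fetchall()
```

Note that one person can have many addresses and many dependents, so those live in their own tables and point back to the person. If a team can walk you through a picture like this for their own system, that is a good sign.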

The challenge for the vendor and state is what happens to this data when it’s time to update the application. There are any number of common but painful changes that often need to be made, but vendors will claim they are too difficult or too expensive to make. Consider these three examples:

  1. Many database systems in the US use a first, middle, and last name. This naming convention completely fails to acknowledge how people are named around the world. Spanish naming traditions might include the forename María or José, or even José María as a single forename. José María Álvarez del Manzano y López del Hierro has a compound forename, and two compound surnames (Álvarez del Manzano and López del Hierro).
  2. Not everyone has a forename and a surname. A colleague in the NHS wrote an article for the BMJ titled The surname I do not have. “My name (one and only name) is Radhika. Until I got married, I was called M Radhika.” However, many database systems are written so that it is impossible to leave either the forename or surname blank.
  3. Native American naming traditions do not map directly onto a first/middle/last name. Pause and consider what it means for a federally funded system to continue, in the 21st century, to disrespect and demean the lives of Native Americans.

It is important to build systems that honor and respect human beings as individuals. It takes intentionality and effort, but it can, and should, be done. A system that is incapable of honoring and respecting a person’s name is unlikely to support them as human beings throughout a benefits application process (for example). And a vendor or state that does not see that this is a problem, or claims their systems cannot be updated to capture these realities, is likely in need of substantial support and education.

It is true that changing something as fundamental as a name can be invasive and require changes throughout a piece of software. However, if it was well written from the start, then changes to the data and the interface will be manageable, tests will be executed to verify the correctness of the changes, and life will go on. Poorly written systems will fail in horrific ways. This is likely a space where vendors will insist on massive, multi-year contracts to do the work and a place where states should begin to rethink their contracting practices.
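As a rough illustration of what a "manageable" change can look like, the sketch below migrates a hypothetical legacy table that requires both a first and a last name into one that requires only a single full name. It assumes SQLite and invented table and column names. Joining the old fields into `full_name` is only a placeholder: in a real migration, people would need to confirm how their names should be recorded. The point is the shape of the work: change the data inside a transaction, check the result, and roll back if the check fails.

```python
import sqlite3


def make_legacy_db():
    """Build a hypothetical legacy table that insists on first AND last name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person ("
        "id INTEGER PRIMARY KEY, "
        "first_name TEXT NOT NULL, "
        "last_name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO person (first_name, last_name) VALUES (?, ?)",
        [
            ("José María", "Álvarez del Manzano y López del Hierro"),
            ("M", "Radhika"),  # a single name forced into two fields
        ],
    )
    conn.commit()
    return conn


def migrate_names(conn):
    """Rebuild the person table so that only full_name is required."""
    before = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    conn.execute("BEGIN")  # SQLite lets schema changes run inside a transaction
    try:
        conn.execute(
            "CREATE TABLE person_new ("
            "id INTEGER PRIMARY KEY, "
            "full_name TEXT NOT NULL, "
            "given_name TEXT, "    # optional
            "family_name TEXT)"    # optional
        )
        conn.execute(
            "INSERT INTO person_new (id, full_name, given_name, family_name) "
            "SELECT id, first_name || ' ' || last_name, first_name, last_name "
            "FROM person"
        )
        after = conn.execute("SELECT COUNT(*) FROM person_new").fetchone()[0]
        if after != before:
            raise RuntimeError(f"row count changed: {before} -> {after}")
        conn.execute("DROP TABLE person")
        conn.execute("ALTER TABLE person_new RENAME TO person")
        conn.commit()
    except Exception:
        conn.rollback()  # leave the legacy table untouched on any failure
        raise
    return after
```

After `migrate_names(conn)` runs, a statement such as `INSERT INTO person (full_name) VALUES ('Radhika')` succeeds, whereas the legacy table would have rejected a row with no last name.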

Questions about databases (10m, solo)

This contextualized example regarding names might help you appreciate the complexity of the data that is being managed by our systems. Unfortunately, if this data is organized and managed poorly, it becomes an excuse for lock-in. “Lock-in,” in this case, can stem from intentional choices, or it can be the result of poor data management over a long period of time. Your job, as State Officer, M.D., is to begin asking questions and helping guide your state to a place where this data is not managed poorly. Instead, we want data that is well-organized, stored in free and open source systems, and managed such that we can back up, migrate, and manage the data in reliable and repeatable ways.

The following questions begin to get at the heart of some of the challenges that large, complex information systems might have. There is definitely more you could ask, but these might serve as a starting point for conversation regarding data and the work that needs to be done to represent data as a living, changing thing.

  • Who “owns” the systems where the data is stored? The state, or the vendor? (Note: in a cloud context, “owns” might mean that they are using an Amazon cloud service, but the state has full control and access.)
  • Who manages the systems where the data is stored? The state or the vendor?
    • Here, we mean “does the state have the technical capacity to manage the database services?”
  • How are the databases in the state’s systems backed up?
  • Are backups tested regularly? How often?
  • Is it possible to automatically restore (successfully) from a backup?
  • When was this last tested?
  • What processes are in place to both verify and validate the integrity of a database restore?
  • In the case of a database crash, how long would we have to wait for systems to come back online from a restore?
  • Is the database layout (they’ll call it a schema) documented and understood by the software team?
  • What is the underlying database system being used to store the state’s data? Is it closed, or open? Why?
  • How difficult would it be to migrate the data from a closed system to an open system?
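To give a sense of what "testing a backup" can mean in practice, here is a small sketch that backs up a database, reopens the copy as if restoring from it, and runs two checks: the database engine's own integrity check, and a comparison of row counts per table. It assumes SQLite and Python's standard library, and the function names are invented for this example. Production systems on other database engines would use their own tools, but the questions in the list above are asking whether something equivalent exists and runs on a schedule.

```python
import os
import sqlite3
import tempfile


def table_counts(conn):
    """Return {table_name: row_count} for every user table."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    return {
        t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
        for t in tables
    }


def backup_and_verify(source, backup_path):
    """Copy `source` to `backup_path`, reopen the copy, and check it."""
    dest = sqlite3.connect(backup_path)
    try:
        source.backup(dest)  # sqlite3's built-in online backup
    finally:
        dest.close()

    # Reopen the file from scratch, as a restore would.
    restored = sqlite3.connect(backup_path)
    try:
        integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
        return integrity == "ok" and table_counts(restored) == table_counts(source)
    finally:
        restored.close()


def demo():
    """Run the check against a tiny made-up database."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    source.executemany(
        "INSERT INTO person (name) VALUES (?)", [("Radhika",), ("José María",)]
    )
    source.commit()
    with tempfile.TemporaryDirectory() as tmp:
        return backup_and_verify(source, os.path.join(tmp, "backup.db"))
```

Matching row counts is a deliberately shallow check. It shows that verification can be automated, not that it is sufficient; a fuller process would also compare the contents of the data.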

This second set of questions will make people nervous, and is very, very likely to uncover some very, very disturbing truths about the systems in place.

Pause and reflect (20m, solo)

This is the first of two lessons on data and databases. Before proceeding, pause.

In your notebook, work through the questions from the section above. Assuming you are familiar with the work in your state(s), answer those questions to the best of your ability. It’s OK if you have to write down “no idea” as an answer to some of these questions.

Once you have taken stock, add some additional notes. Asking these questions directly, while fun, might set people unnecessarily on edge. Who at the state level might have insights in this space, and could you have a conversation with them? What questions could you probe gently, so that you have a fuller picture of what is going on “on the ground”?

Reflecting on these questions will prepare you for the next lesson, where we talk about the migration of data, and ultimately come together in a group to talk all-things-data.

Sharing experience (30m, small group)

Meet with your small group and connect what you learned in this lesson to situations you’ve seen with your state projects. Consult the notes you took throughout the lesson and try to link them to a story that you can tell about a particular project. It’s probably useful to do some brainstorming on this before you meet with your small group to trade stories.

When you get together with your small group:

  1. Share your stories with each other.
  2. Figure out which ones are the best candidates for a case study or use case that would be helpful to share with other state officers.
  3. As a group, choose useful stories and write notes on how they link up with the concepts shared.
    • Include in your notes:
      • When did this story take place?
      • What were the events or background leading up to this story?
      • How did this story demonstrate an ideal or non-ideal situation?
      • What specific principles from the lesson does this story illustrate?
      • If the story shows an ideal, what were the conditions that made it work? How did it fit with the principles shared in the lesson?
      • If the story shows a non-ideal, what could have changed to make it better?
  4. Share these notes with the larger group when you meet.
  5. After discussion with the larger group, document these stories and their connections to the lesson to help other state officers understand how this lesson’s concepts apply to their work.

There is no full-group discussion on databases (yet), as this is the first of a two-part lesson.

In the guides

This lesson is the beginning of a journey. If you're interested in learning more, there's material in the 18F Derisking Guide that you'll want to check out.

From the Federal Field Guide:

From the State Software Budgeting Handbook:

Wrapup (5m, solo)

Take a few minutes to share your reflections on this lesson.