Possible play #14: "build digital services on internal APIs" #64

Open
nsinai opened this issue Aug 27, 2014 · 12 comments

Comments

@nsinai

nsinai commented Aug 27, 2014

Consider something similar to Amazon's play: "All teams will henceforth expose their data and functionality through service interfaces"

More reading here:

http://apievangelist.com/2012/01/12/the-secret-to-amazons-success-internal-apis/

@seanherron

I would go even further and argue that any time the government builds a system designed to accept submission of data, it should both build that system on APIs and make those APIs externally available as well.

@ErieMeyer
Contributor

+10!

@matthewhancock

I agree with the importance of planning for APIs and making them available publicly. There are a few potential issues worth observing and preparing for based on my experience with a former employer adopting a similar approach.

It added latency as the data retrieval process changed from:
Query Database > Process Data > Display Data
To:
Query Database > Process Data > Serialize Data > Transfer over Web Service > Deserialize Data > Process Data > Display Data

This should be trivial in most cases, but with inefficient querying, tasks I had to complete took seconds, and sometimes minutes, waiting for data. In some cases performance was limited by large amounts of unnecessary data being returned, which required joins that slowed retrieval (e.g. retrieving all customer data when only a few properties were required). Other cases required multiple serial calls contingent on data retrieved from prior calls, and/or multiple additional calls to retrieve paginated data.
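
One common way to avoid the over-fetching described above is to let callers name the properties they need. The following is only a minimal sketch under stated assumptions: `CUSTOMERS`, `get_customer`, and the `fields` parameter are hypothetical names invented for illustration, and the in-memory dict stands in for whatever database the real service would query.

```python
from typing import Any, Dict, List, Optional

# Hypothetical in-memory stand-in for a customer table (assumption, not from the thread).
CUSTOMERS: Dict[int, Dict[str, Any]] = {
    1: {
        "id": 1,
        "name": "Ada",
        "email": "ada@example.gov",
        "address": "1 Main St",
        "orders": [101, 102],
    },
}


def get_customer(customer_id: int, fields: Optional[List[str]] = None) -> Dict[str, Any]:
    """Return one customer record, optionally limited to the requested fields.

    When `fields` is None the full record is returned; otherwise only the
    named properties are included, so the caller does not pay to serialize
    and transfer data it will never display.
    """
    record = CUSTOMERS[customer_id]
    if fields is None:
        return dict(record)
    return {name: record[name] for name in fields if name in record}


# A caller that only needs two properties asks for just those.
summary = get_customer(1, fields=["name", "email"])
```

In a real service the field list would also need to reach the query itself, so that unneeded joins are skipped rather than merely filtered out afterwards.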

Ultimately, this becomes an issue with fragmented and legacy systems that I'd imagine impacts government IT more than Amazon. There might need to be some level of requirements to ensure the APIs don't slow down performance (i.e. mandatory performance targets) or require additional capacity due to the retrieval and return of excess data. Additionally, there should probably be some group in between the data owner and data subscriber that is dedicated to reconciling issues and creating new APIs as needs change.

I don't know how relevant these points are in this case, but I thought they might be worth noting in this thread.

@lheyman

lheyman commented Aug 29, 2014

Great points @matthewhancock. Perhaps include in this play a couple of "audibles" around inserting caching layer(s) or some other means of providing the service asynchronously where possible (see e.g. Facade Patterns), to either serve the data faster, eliminate (or significantly reduce) the processing step between the query and the serializing, or prevent the introduction of new, inefficient db queries.
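
As a rough illustration of the caching-layer idea, here is a minimal sketch of a facade that sits in front of a slow backend call and serves repeat requests from memory. Everything named here (`CachedFacade`, `slow_lookup`, the 60-second time-to-live) is a hypothetical choice made for the example, not something proposed in the thread.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedFacade:
    """Facade that caches results of a slow fetch function for a limited time."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # the slow call being fronted (assumed)
        self._ttl = ttl_seconds      # how long a cached value stays fresh
        self._clock = clock          # injectable so expiry can be tested
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh cached value, backend not touched
        value = self._fetch(key)     # miss or stale: go to the backend once
        self._cache[key] = (now, value)
        return value


# Hypothetical slow backend that records how often it is actually called.
calls = []


def slow_lookup(key: str) -> dict:
    calls.append(key)
    return {"key": key}


facade = CachedFacade(slow_lookup, ttl_seconds=60.0)
first = facade.get("a")
second = facade.get("a")  # served from the cache
```

A time-based cache like this trades freshness for speed, so the time-to-live would have to be chosen per data set.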

Also +20hundred @seanherron!

@nacin
Contributor

nacin commented Aug 29, 2014

Well, placing an API (even a rudimentary one) in front of fragmented and legacy systems is also a great way to decouple yourself from the pain of those legacy systems. Decoupling is a key play for development, maintenance, testing, and deployment. (It also works for sunsetting — i.e., you can swap out that legacy system as long as the API interface remains the same.) You're also not going to be repeatedly hitting the database for reads; you're going to use fast, portable serialization; you're going to want to design your data store with the API in mind if you can (so probably sans joins); and you're going to want to do intensive tasks asynchronously (if that's really the bottleneck).
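
The decoupling point can be sketched as follows: callers depend only on a stable interface, so the legacy system behind it can be replaced without touching them. This is an illustrative sketch only; `RecordStore`, `LegacyStore`, `ModernStore`, `api_get_record`, and the pipe-delimited legacy format are all hypothetical names and assumptions made for the example.

```python
from typing import Dict, Protocol


class RecordStore(Protocol):
    """The stable interface that API consumers rely on (assumed shape)."""

    def get_record(self, record_id: str) -> Dict[str, str]: ...


class LegacyStore:
    """Stand-in for a legacy system with an awkward native format."""

    def __init__(self) -> None:
        self._rows = {"42": "42|PERMIT|APPROVED"}

    def get_record(self, record_id: str) -> Dict[str, str]:
        # Translate the legacy pipe-delimited row into the API's shape.
        rid, kind, status = self._rows[record_id].split("|")
        return {"id": rid, "type": kind.lower(), "status": status.lower()}


class ModernStore:
    """Stand-in for a replacement system that stores the API shape directly."""

    def __init__(self) -> None:
        self._docs = {"42": {"id": "42", "type": "permit", "status": "approved"}}

    def get_record(self, record_id: str) -> Dict[str, str]:
        return dict(self._docs[record_id])


def api_get_record(store: RecordStore, record_id: str) -> Dict[str, str]:
    """The API layer: it never needs to know which backend it is talking to."""
    return store.get_record(record_id)


before = api_get_record(LegacyStore(), "42")
after = api_get_record(ModernStore(), "42")  # same response after the swap
```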

I love this. Building digital services based on internal and external APIs = a huge win.

@nsinai
Author

nsinai commented Aug 29, 2014

+1 @seanherron

@gbinal

gbinal commented Aug 29, 2014

👍

Happy to help craft some language for a pull request.

@kinlane

kinlane commented Sep 1, 2014

@matthewhancock your concerns are definitely valid, and they add urgency to the need to push forward with an API-centric approach. The sooner any legacy database system is on the API road, and iterating, the faster you can move beyond this type of constraint.

Once you start understanding how the API will be consumed, you can begin iterating away from the legacy perspective. By employing caching and distributed approaches to deploying your architecture, you can begin to alleviate the load on the core database. In a mature implementation I think seeing "Query Database > Process Data > Serialize Data > Transfer over Web Service > Deserialize Data > Process Data > Display Data" would eventually go away.

To understand what is possible from folks who have been at this a while, take a look at what Daniel Jacobson from Netflix is doing, and their reworking of a more resource-based design into a more experience-based design (http://techblog.netflix.com/2012/07/embracing-differences-inside-netflix.html). It is easy to think of how APIs are impacting the database, because you are seeing this from the database (aka resource) perspective, and not from the experience on the web app, mobile app, and beyond. Once you start focusing on the experience vs. the resource, it starts flipping the paradigm: the reality of your API pipeline directly querying your database and putting load on it will eventually go away. You will start distributing and caching your resources differently, and syncing with your core database either in real time or on some other schedule based upon system needs.

A simple Netflix example of resource-based vs. experience-based design: with a resource-based design you’d have /catalog/movie/[title], whereas with an experience-based design you might go with /home/tv/movie/[title] or /home/ipad/movie/[title], requiring your content to be staged and delivered differently.
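
To make the contrast concrete, here is a small sketch of one resource-style handler and two experience-style handlers over the same data. Only the three URL shapes come from the comment above; the `CATALOG` contents, the handler names, and the way each payload is trimmed are assumptions invented for illustration and do not describe how Netflix actually shapes its responses.

```python
from typing import Any, Dict

# Hypothetical catalog entry; all field names and values are assumptions.
CATALOG: Dict[str, Dict[str, Any]] = {
    "example-title": {
        "title": "Example Title",
        "synopsis": "A long synopsis that a small screen may not have room to show in full.",
        "cast": ["Actor One", "Actor Two"],
        "posters": {"small": "/img/s.jpg", "medium": "/img/m.jpg", "large": "/img/l.jpg"},
    }
}


def catalog_movie(slug: str) -> Dict[str, Any]:
    """Resource-based: /catalog/movie/[title] returns the whole resource."""
    return dict(CATALOG[slug])


def home_tv_movie(slug: str) -> Dict[str, Any]:
    """Experience-based: /home/tv/movie/[title], shaped for a large screen."""
    movie = CATALOG[slug]
    return {
        "title": movie["title"],
        "synopsis": movie["synopsis"],
        "poster": movie["posters"]["large"],
    }


def home_ipad_movie(slug: str) -> Dict[str, Any]:
    """Experience-based: /home/ipad/movie/[title], shaped for a tablet."""
    movie = CATALOG[slug]
    return {
        "title": movie["title"],
        "synopsis": movie["synopsis"][:40],  # shorter text for a smaller layout
        "poster": movie["posters"]["medium"],
    }


tv_view = home_tv_movie("example-title")
ipad_view = home_ipad_movie("example-title")
```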

To understand what is possible from a more database-oriented way of thinking, consider historical data warehousing techniques. You wouldn’t run complex reporting off your transactional database; you would build cubes or load the data into warehouses, restructured just for the reports. This is similar to what we are seeing with API design: your data storage and transmission needs will evolve once you start delivering your resource-based API designs and tuning into the experience of how people actually need the data / resources delivered, instead of being dictated by the legacy database.

Hope that helps. Definitely a concern, but something we all need to work together on to help each other evolve beyond, and the path forward will vary depending on the resource.

Thanks for the shoutout on my AWS post, Nick! ;-)

@dsmorgan77
Contributor

To accomplish this, you can just revise play 13 a bit. Rather than focusing on providing APIs to the public or third parties, just generalize those statements by eliminating references to "public" and "third parties."

No need for a whole new play.

@kinlane

kinlane commented Sep 2, 2014

I am happy to craft two new posts similar to the Amazon one: one that addresses internal government agency agility and flexibility, and one that applies inter-agency vs. intra-agency, and what it can do for improving how gov works together and partners.

@ErieMeyer
Contributor

Please do, Kin. Would love to see that.

@alyraz

alyraz commented Nov 6, 2014

+1M @seanherron & @nsinai. I think this is an excellent suggestion and vote for dogfooding to be a whole new play. I am in the midst of advocating for an API in the new Rec.gov contract, and the biggest pushback we've received is that "adding" an API means additional costs, and this may dissuade bidders. If procurements required dogfooding then this argument would be null! You've got to build the functionality anyway.
