An Insider View of the Salesforce Architecture

How are you all today? Yeah? Everybody doing good? Are you having a good time at TrailheaDX? Yeah? Awesome. My name is Ian Varley, and I am here today to give you an insider view of the Salesforce architecture. This is the real shizzle, y'all. This is where we're going to have the real talk about what's going on under the covers to make Salesforce go. Y'all ready? Y'all psyched? Yeah? All right, good.

OK, so I'd like to let you know about some forward-looking statements I'll be making. You've seen this about 300 times today already, I'm sure, but don't make any purchasing decisions based on what I'm going to say, and I mean that quite seriously.

OK, so who am I? My name is Ian Varley and I'm a software architect at Salesforce. I've been at Salesforce for about seven years, and my team is called Architecture Strategy. Our goal is to clarify and simplify the architecture of Salesforce. Nice simple goal; much harder than it sounds. I live in Austin, Texas. Anybody here from Austin, Texas? Anybody? Yeah, Austin represent, what's up! It's a balmy 300 degrees in Austin today, so I'm glad to be here. And this is me on Twitter: "the future Ian" or "the futurian," depending on how you pronounce it.

All right, so what is our topic today? Our topic is everything. Not really everything, but everything related to architecture. I'm going to try, in 30 short minutes (which I think maybe I get a couple extra minutes because the timer hasn't started yet... oh God, I started it, uh-oh, well, OK), in 30 short minutes, 29 minutes and 54 seconds, to give you a top-to-bottom view of what makes Salesforce tick. I'm going to cover some basics real quick to make sure we're all on the same page to start with. Then I'm going to talk about the overall system architecture, then about the kernel of the Salesforce CRM core application, and then about the way the data is structured. This is a lot to cover in 30 minutes, and I apologize, because this is going to be a little bit like drinking from a
firehose, and I'm sorry about that, but such is life. But I will say: if you have questions that remain unanswered at the end of my talk, I would love, love, love for you to come and talk to me right afterwards, or we can go get coffee. I want to hear what you have to say, and I would like you to ask me really hard questions. Cool? Yeah? Good.

OK, I'll start with the basics. As you may have heard here and there, trust is probably the most important thing that we do as a company, because you're entrusting us with really important data. So the paramount thing we think about in everything we do is the security, the availability, and the correctness of your data and of the systems that you use on Salesforce. That's the number one thing we worry about. But, and it's a big but, we also operate at web scale: billions and billions of transactions every single day. So the unique thing that Salesforce does in the industry is the combination of those two things: enterprise trust at web scale. That's our mission, and everything in this architecture that you're going to see supports enterprise trust at web scale.

So the basic premise of Salesforce, as you've heard a hundred times, is software as a service. We run it, you use it. Isn't that wonderful? Now, technically speaking, that's a lot more important to you than you might realize, because it means all of the dumb stuff about software maintenance, we do it: database upgrades and capacity and troubleshooting. I sometimes joke that our slogan is "Salesforce: we work with Oracle support so you don't have to." Just kidding, just kidding, Oracle.

OK, so the other big concept in the Salesforce world is multi-tenancy. Now, you might be familiar with web applications where you have personal data online, like your banking system, and you're probably also familiar with web-based systems where everybody shares common data, like
Wikipedia or social networks. Multi-tenancy is somewhere right in the middle, where you have groups of users who share data with each other but not with other groups of users. So that's like your company: everybody in your company is working on a shared pool of data, but never the twain shall meet with other pools of data from other companies. That's extremely important. That's the most critical security boundary within the system. Your data shares hardware and shares software with other tenants running at the same time, so that software has to keep all that stuff separate from tenant to tenant. That's of fundamental importance.

OK, so that's the basics. I just wanted to make sure we're all on the same page. Are we all on the same page? Show of hands? Yeah? All right, awesome.

OK, so let's talk about the system from a very high-level perspective. You can think of Salesforce as one massive distributed system. We're going to start from the bottom up, but think about this as a sort of guided tour of the various levels of scale of Salesforce. Starting at the very bottom, I already talked about tenants: you're inside of a tenant, and it's a logical construct. But then outside of that tenant, when you work with it and make requests to it, those requests are running in multiple different services: application services, data storage services. I'm going to talk about this a little bit more. Those services are then grouped into instances (and I say "sharded" here, and I'll explain what I mean by that), but basically those services are all in these independent instances. And then there are groups of these instances, which we call properties, which are architecturally diverse parts of Salesforce; Heroku, for example, is architecturally diverse from Salesforce core. So you've got tenants, which use services, which are inside of instances, which are inside of properties, all of which run in real physical data centers that have
spread on many, many thousands of servers all over the world, to make this one big product that is distributed around the entire globe. So that's a fairly complicated thing to manage.

Inside of a single tenant: I already talked about tenants 101, so I won't talk about this too much, but basically the way we logically separate that data is using identifiers. It's not on separate VMs, and it's not on separate hardware. The tenants just use different identifiers, but they are stored in the same database tables, running on the same servers, in the same processes, side by side with each other.

Now, I mentioned services. We have a lot of services. You might say, is this a microservices architecture? Maybe. We do have a lot of services. We like to think of it a little bit more like a "mezzo-service" architecture (Steve Tam came up with that word), and I'll explain a little bit more about what I mean by that. But yes, there are a lot of different software services, physically running daemons on servers, in containers and on physical hardware in our data centers, that do things like store data, run business logic, do caching and search indexing, store big blobs of data and files, do file-conversion-type processes from PDFs to other kinds of data, and things like that. On top of that we also have services for machine learning and queuing and event subscription and feature flags, and basically tons and tons of different discrete software services, which is great. And I should note, when I say "service" I don't mean one process running on one server. Typically, most of these services are actually little fleets of servers behind a load balancer, or in a cluster, or something like that.

OK, so now there are two special services that I'm going to come back to in a minute, so hold your horses. Those are the underpinning of the Sales Cloud, Service Cloud, Communities Cloud, and Platform clouds, and they are the application server and the
database server. I will come back to those, I promise, but those are sort of the special big ones.

Now, the services are grouped into instances. What are instances? Instances are essentially separate stacks of hardware: database servers, application servers, et cetera. We have a bunch of those, topping 100 at this point just within the CRM core stack alone. There are a few reasons we do instances. The main one is scaling: we'd like to be able to scale out, and since tenants are logically separate, we can shard them into different instances. That's wonderful; it's good for us and it works for everybody. We also do it for fault isolation, so if there is some kind of hardware problem, or something that isn't handled by the software, the blast radius of that is smaller. We also do it for geographic latency: for example, for our customers in Japan, the instance that serves them is in Japan, so their ping time is a lot shorter. So there's a bunch of reasons why we do that.

Now, outside of that, of course, we also have, as I mentioned, properties. Salesforce has grown in a variety of ways. It's grown organically over the years, and it's also grown via acquisition. So we have different stacks for different kinds of purposes, and mainly what this does is allow us to make different kinds of architectural trade-offs regarding things like data caching and graceful degradation, things that might really need to work differently for one type of application than for another. If you go to trust.salesforce.com, you can see a few of the different properties we've got: trust.salesforce.com, trust.marketingcloud.com, and so forth. So this is an indication; you can see those different deep-level architectural stacks, or properties.

Now, you're probably sitting here thinking, OK, but this is all pretty abstract; let's get into some real
details about, like, CRM and core and how it actually works. Cool, let's talk about the kernel.

As I said, in these next couple of sections I'm going to be talking specifically about what we call CRM core: that is Sales Cloud, Service Cloud, Marketing Cloud (no, not Marketing Cloud, delete that), and the platform.

OK, so I mentioned the two big services. One of them is the relational database. If you think about it like two bodies in orbit, the relational database is the core of each of our instances, like NA1, NA2, NA3. That's a relational database that is running as the center of gravity, with lots of other services that connect to it and depend on it as their primary system of record. Then we have a whole bunch of application servers of different types that, you might say, orbit this one database, in a relationship with it. Now, there are dozens of other services that interact with these two, as I already said, but these two are the ones I'm going to deep-dive into for the next few minutes.

So one is the kernel. The kernel gives us a few things, including an application container, access control, product defense, and a few other things, and I'm going to talk about these specifically.

So let's talk about the application container. That's when you make a request, an API request or a UI request, to Salesforce. That's going to travel over the internet and arrive at Salesforce at the front door, and then a bunch of stuff has to happen. When it gets to us, it hits a stateless pool of application servers. Stateless means it doesn't matter if we lose one or another of these particular servers: there's nothing of your data physically stored on these servers. So if something's malfunctioning with one of them, we can just walk up and shoot it in the head and everything's fine; all the traffic just moves over to another server. This is wonderful. It's a very important property in application
servers, for them to be stateless. And it's fronted by a load balancer, so that if one server is not behaving well, or is a little bit slow, the load balancer can redistribute that traffic to the other servers and everything's fine.

Now, this server runs a piece of open-source software called Jetty, which is a servlet container. That is essentially the router that allows us to say, OK, based on what URL you're requesting or what your API request was, what piece of actual code should be handling this? If you said "I want to update a record," it can send it over to the code that does record updates, and if you say "I want to render a report," it can send it to that code. So it's essentially a big router. It handles a few other things as well, but that's the main thing that Jetty does for us. Within this there's a whole bunch of other systems, for things like configuration and logging and debugging, which all live within this application container. I'm going to skip the login flow for the sake of time, but that's the basic function here: the request path from "I want something out of Salesforce" to "what do I get back." That's the application container's job.

Now, there's another important function within this kernel, which is access control. I already mentioned tenants: the tenant is the boundary for data. But you probably know that inside of that boundary there's a ton of other access control as well, internal to the tenant. For example, you can set permissions on who can create records or delete records on individual objects, you can set field-level permissions, and you can set row-level sharing with lots of complex hierarchies. This is a pretty complex subject, but it is all handled by the application container, by this application server code, which is great. Some of the complexity here is such that, if you can imagine, there's a
facility within Salesforce where you can say, OK, you can see this record if you're my boss, or if you report to me, or if you report to someone who reports to me, or if you're in the division that this has been shared with. There are lots of ways you can get access to data, and calculating that on the fly all the time is very, very expensive from a data access perspective. So we actually materialize a lot of that stuff when data changes. At the time that it changes, we say, OK, we're going to spawn out a representation of who can access what at this time. Like a lot of things in engineering, that's a trade-off: we're trading off some space (the amount of space that it takes to store this) and some time when data changes, in exchange for much faster access when you want to see who can see what.

All right, now I want to talk a lot about product defense. Product defense and security is extremely... oh, I'm sorry, actually I can't talk about this. We do a ton inside the product to secure your data, and I am NOT going to tell you about any of it, because this is a very public forum. But if you have questions about our product defense against things like malware or various other types of attacks, I'd love to talk to you about that one-on-one, so please come and talk to me about it afterwards. But it does involve Pop-Tarts.

OK, so next I want to talk to you about a thing called the UDD. Has anyone here ever actually heard of the UDD? One... one... nobody? Oh, awesome, this is like fresh meat. All right, so UDD stands for Universal Data Dictionary, and what the UDD is, essentially, is an abstraction inside this application server that allows our engineering teams, and you, to define the metadata shape of your world in Salesforce: to create objects, to create attributes, field types. All of the stuff that you need to do as far as defining your business in Salesforce happens in this layer. So it is a declarative entity model.
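
To make "declarative" concrete, here is a minimal Python sketch of an entity model in which the shape of the data is itself data, and one generic piece of code serves both a built-in entity and a tenant-defined one. The class names `Attribute` and `Entity`, the `validate` helper, and the example entities (including the `Invoice__c` custom object) are hypothetical illustrations, not the UDD's actual format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One declared field: a name, an expected Python type, and a required flag."""
    name: str
    kind: type
    required: bool = False


@dataclass(frozen=True)
class Entity:
    """A declared entity: just a name plus a tuple of attribute declarations."""
    name: str
    attributes: tuple

    def validate(self, record: dict) -> list:
        """Check a record against the declaration and return a list of problems."""
        errors = []
        known = {attr.name for attr in self.attributes}
        for field_name in record:
            if field_name not in known:
                errors.append(f"unknown field: {field_name}")
        for attr in self.attributes:
            if attr.name not in record:
                if attr.required:
                    errors.append(f"missing required field: {attr.name}")
            elif not isinstance(record[attr.name], attr.kind):
                errors.append(f"wrong type for {attr.name}")
        return errors


# A "standard" entity and a tenant-defined one use exactly the same facility.
account = Entity("Account", (Attribute("Name", str, required=True),
                             Attribute("AnnualRevenue", float)))
invoice = Entity("Invoice__c", (Attribute("Number", str, required=True),
                                Attribute("Amount", float)))

print(account.validate({"Name": "Acme", "AnnualRevenue": 1.5}))
print(invoice.validate({"Amount": "ten", "Bogus": 1}))
```

Nothing in `validate` knows about accounts or invoices; adding an entity means adding a declaration, not writing new code.
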
Declarative: what that means is the shape of this data isn't a bunch of lines of code. It's in a format that just says, this is the name of the entity, these are the names of the attributes. It's checked into source control when we do it, or it's stored in the database when you do it, but it's a very important point that it's a declarative entity model. What that means is it allows the same interactions for internal and external platform developers. When you're creating entities, and when we're creating entities that are shared across all organizations, all tenants, by and large it is the same set of facilities that you have for modeling data and for working with data. That's a pretty important shared abstraction, and the fact that you can do that is a big part of the power of the platform.

Now, internally, it didn't always use to be like that. When Salesforce was started, it was a sales tool; a lot of this platform stuff came along a little bit later. So what they did when they started the company was say, OK, we've got an account object and we've got a contact object, so let's just make an accounts table and a contacts table in our database. Doesn't that sound nice? So they did that, and for a large number of years, by and large, when internal Salesforce engineers wanted to create something that every tenant would share, it became a physical table in the database. Meanwhile, we also have the ability to create custom objects, objects that you define and create, and of course that doesn't actually work by creating tables in the physical database, for a variety of reasons that I'll talk about a little bit later. The declarative definition model is what allows those two things to coexist. Now, after a while we were like, you know, we have this wonderful
custom object stuff that our tenants can use; why are we still creating physical tables for the stuff that we're doing? So Mr. Steven Tam, who's standing in the back here, wrote a few nice extensions that allowed internal engineers to basically do the same things that customers do. So this is actually a case where the facilities that you had as customers for building on the platform were a lot easier, for a long time, than they were for internal engineers. But we eventually got that parity, and so now we have this thing called base platform objects, which does that.

Now, in addition, at runtime, when you're actually serving data, the UDD is what functions as essentially an object-relational mapping layer between the data that's down there at the database level and what you see materialized in your org. But, and this is a key point, what's actually in the database looks very little like what you see coming out. You're used to going to a database and saying SELECT * FROM some table, and then that's what's actually in the table. That's not how Salesforce works. Because of this metadata layer, all of that is virtualized, and I'll talk a little bit more about that when I talk about the database. But essentially, if you were to go and look directly at the database layer without this application server as the intermediary, it would be useless to you. You'd have a lot of work to do to get it to work for you. The analogy that I make is that it's a little bit like going to the gas station and trying to pump crude oil into your engine. That's essentially what you'd be getting.

So now, a brief aside. Is anybody here familiar with Salesforce IDs, the 15-character IDs? Do you know there's some structure inside of these IDs? Not everybody knows that, so I like to just put a point on it. There are a few different parts. The first three characters are what we call the key prefix. The key prefix is what
identifies which entity. If you were working in a standard relational database, that'd be like which table or which object: which entity in Salesforce is this? It's a three-character identifier that's specific to your organization, your tenant.

The next bit is which instance. Remember how I said there are these hundred different instances that are these groups of services? Which one are you on? Now, this is a bit of a trick, because this doesn't actually mean which one you are on. It means which one this record was created on, because your tenant can actually move from instance to instance. This happens during instance refreshes, it happens during org migrations, and various other things. So why is this there, if it's not actually telling you which instance you're really on? The reason is uniqueness. If this weren't there, there would always be a possibility of collisions between IDs created on one instance and IDs created on another instance, and we'd never be able to move those two organizations to sit on the same instance. So this is just a thing that makes sure that no two records are ever created with the same ID. That's very important.

Now, the next two bits are reserved (or characters, I mean), and it's really good that we have reserved characters. This is a fun story: originally there were three reserved characters and only a one-character instance ID. Any of these slots can be filled by a letter, lowercase or uppercase, or a number, so there are 62 possible characters that can go in any one of these slots. What that meant is that in the original design we only had room for 62 instances, because that was the one-character instance ID. When we got to about 30 or 40 instances, we were like, oh... oh, that's not good. But thank God we reserved
characters there in that part of the ID, so we were able to change the code to use two characters for the instance. So now there are 62 squared possible instances. Now, when we get to 3,844 instances we're going to be in trouble again, but we still have more reserved characters, so don't worry, it's OK. And then the next part, of course, is the unique part of the ID, which is 62 to the eighth power, so there are plenty of those to go around.

And now here's a pop quiz for you. Does anybody know why sometimes Salesforce IDs are actually 18 characters and not 15 characters? What? Yes, it's right there. There are some cases, with the API and various other things on the internet, that are not going to preserve case specificity. But as I already said, these are case-specific IDs. So what would happen if you take this ID with a bunch of uppercase and lowercase letters and change it all to lowercase? It's a different ID. That's what we call bad. So what we did was throw on three extra characters, which are a checksum of the case of all of the other characters in the ID. By using that, you can say, "I don't care what case they showed up in, I know what case they're supposed to be." That's the case checksum, and that's why you'll sometimes see 18. But you can always convert from an 18 to a 15 and vice versa, as long as you know the case of the 15. All right, good job, good job.

OK, so now we only have a few minutes left, so I'm going to go kind of lightning speed here, but let's talk about the database. As I said, Salesforce models a virtual database on top of a physical database, which is kind of cool, and it does a lot of the same stuff that a physical database does: it gives you a rich model of entities and attributes, relationships, and all that kind of stuff. So when you work with the UI and the API, that's what you see. All right, and this is entirely not true;
it's a fiction. None of this really exists at the database level. Well, it's not fiction, but none of this is really physically manifested at the database level. So when you create an entity, it does not actually do any DDL (if you're familiar with DDL, data definition language). It doesn't create physical structures in our relational database. What it does is insert a few rows in a few tables, and then the system behaves as if there's a whole new entity there.

Now, why is that? Why did we make that decision? Because, generally speaking, in a relational database, when you do DDL, when you change the definition of a table, like add a column or delete a column, it locks that table while you're doing that. That's fine if you're the only one using the database, but this is an online system, all over the internet, and we don't want to take any downtime. When the admin in your org adds a new field to a thing, we don't want to lock the whole thing for 10 minutes while it updates. So that's why it's all virtual.

Now, interestingly, there's a lot of good and a lot of difficult stuff that comes along with that decision to make a completely virtualized database. Things that you would expect to just happen normally in a database, like foreign keys, we can't use. Unique indexes, we can't use those either, because everything in the database is this virtualized thing. So all the stuff you know about how a database works, you have to just throw out the window. Or, more specifically, we had to recreate all of that stuff. So we do have unique indexes, and we do have foreign keys, but they're all enforced at the software level, not by the database. Now, that's got some good points. It means we have more database portability. We use one particular database substrate for a lot of our instances, but we can, and should, and maybe will port to
multiple databases, and we can do that because we don't really use this level of database stuff; it's all in our software. So that's wonderful. But that also goes for indexes and a whole bunch of other stuff. So lots of fun for us as software engineers: very interesting, hard, challenging problems.

But as I said, under the covers it is in fact a bona fide relational database. So how do we actually do that at the scale that we do? We run most of it on Oracle, and as a general rule each of our Oracle instances is running in the neighborhood of a few thousand tenants simultaneously, which is a pretty heavy workload to be asking one physical database to handle. We do a bajillion tricks to make that work ("a bajillion" is a technical term).

Multi-level partitioning is one. We have partitioning within the database instance itself onto multiple physical servers with a storage back end. Within that, we have partitioning onto multiple physical files on disk, and a mapping between tenants and physical files, and they have to have affinity to a certain node for caching, and a whole bunch of things like that. So there are lots of physical nodes, and then there's this mapping between them, and each application server has a pool of database connections that go over to the database server. We do a ton of tricks to make this work in practice.

For example, we have created our very own database optimizer. You know how, when you send a SQL query to a database, it has an optimizer to figure out how it should physically execute that query? Well, when we use the database's own query planner, it just gives us garbage, because this is a multi-tenant system and the planner doesn't know anything about what's actually happening there. So we write our own query plans and we enforce
them using database-level hints. We also have a thing called skinny tables, which is essentially taking one dimension of a table and spawning it out for faster index performance. All of this, thankfully, is mostly invisible to you, and that's part of the whole deal here. That's part of why you're a Salesforce customer: you get the power of developing stuff on top of this platform without having to be on the phone to Oracle support every day like we are. This level of indirection is invisible to you, and that's super, super important.

One last thing that I'll mention, that I think is probably interesting, is one way that the underlying database actually does shine through in a way that you can see, and that is database transactions. Anybody familiar with what a transaction is at the database level? A few people? A transaction basically means I can do multiple kinds of work in this database, and then I can atomically commit or roll those back together. So I could say, take a thousand dollars out of this account and put a thousand dollars into that account, and either both are going to complete or neither is going to complete. This is a facility that relational databases have had for decades, and it's super important, because if you don't have it, all the people who build applications on top of your platform have to worry about that. They have to worry: what if it failed after I took the money out of here but before I put the money in there? And if you have a multi-step process, you've got to think about that with every permutation of those steps. Relational databases solve that with transactions.

And it turns out that when you do multiple things within a piece of Apex code, for example, those are transactionally protected at the database level, which is wonderful. That means you can say, I'm going to mutate a few records over here and a few records over there, and if you're doing that with Apex
code, since it's all running within the same transaction at the application server level, you get that database transactionality for free, which is wonderful. It's also a bit of a challenge for us, because it requires colocation of certain kinds of services that we would probably rather be able to split into different micro- or mezzo-services. But that's our problem, not yours.

So if you like this and you would like to see more, I would like to inform you that this entire presentation has actually just been an advertisement for my blog series on Medium. I write a series called The Architecture Files on Medium, and you can read about all these topics and many more if you subscribe and go and heart all of my articles. We've talked about sharding, and we've talked about event publication architectures. "The Database Is a Magician" is a favorite of mine; lots of video game references. So for all of the stuff we've talked about here, if you kind of got what I was talking about but maybe not 100 percent, go check it out there, and I bet you will have a better experience and a deeper understanding. And follow, because we'll be continuing to publish.

OK, so as I said, I don't think we have time for questions, but I'm going to step off the stage right over here, and I would love for you to mob me. We can go get coffee, and we can talk about whatever you want. So please come see me, and thanks for your time. [Applause]
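
As a companion to the ID discussion in the talk, here is a minimal Python sketch of splitting a 15-character ID into its parts and of the case checksum that turns it into an 18-character ID. The 3 + 2 + 2 + 8 layout follows the talk. The checksum encoding (one suffix character per block of five, with a bit set for each uppercase letter, mapped onto `A`-`Z` then `0`-`5`) is the commonly documented scheme and is an assumption beyond what the talk states. The function names and the sample ID are made up for illustration.

```python
import string

# 32 symbols, one for each possible 5-bit "which characters are uppercase" mask.
ALPHABET = string.ascii_uppercase + "012345"


def split_id(id15: str) -> dict:
    """Split a 15-character ID using the layout described in the talk."""
    return {
        "key_prefix": id15[0:3],   # which entity
        "instance": id15[3:5],     # instance the record was created on
        "reserved": id15[5:7],     # reserved characters
        "unique": id15[7:15],      # unique part, 62**8 possibilities
    }


def to_18(id15: str) -> str:
    """Append a 3-character suffix that records the case of each character."""
    if len(id15) != 15:
        raise ValueError("expected a 15-character ID")
    suffix = ""
    for start in (0, 5, 10):
        chunk = id15[start:start + 5]
        # Bit i is set when the i-th character of the chunk is an uppercase letter.
        bits = sum(1 << i for i, ch in enumerate(chunk) if ch.isupper())
        suffix += ALPHABET[bits]
    return id15 + suffix


def restore_case(id18: str) -> str:
    """Recover the correctly cased 15-character ID from an 18-character ID
    whose letter case may have been mangled in transit."""
    if len(id18) != 18:
        raise ValueError("expected an 18-character ID")
    body, suffix = id18[:15].lower(), id18[15:].upper()
    out = []
    for block, check in enumerate(suffix):
        bits = ALPHABET.index(check)
        for i in range(5):
            ch = body[block * 5 + i]
            out.append(ch.upper() if (bits >> i) & 1 else ch)
    return "".join(out)


sample = "001D000000IqhSL"           # illustrative ID, not a real record
print(split_id(sample))
print(to_18(sample))
print(restore_case(to_18(sample).lower()))
print(62 ** 2)                        # instance IDs available with two characters
```

Because the suffix only uses uppercase letters and digits, it survives lowercasing of the whole ID, which is what lets `restore_case` rebuild the original 15 characters.
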

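The "virtual database" idea from the talk (creating an entity inserts a few rows rather than running DDL, and tenants are separated by identifiers within shared tables) can be sketched with Python's built-in `sqlite3` module. This is a toy under stated assumptions: the table names `objects`, `fields`, and `data`, the three generic `val` columns, and all helper functions are hypothetical, and the real schema is certainly different and far richer.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# The only DDL ever run: three fixed, shared tables created once up front.
db.executescript("""
CREATE TABLE objects (org_id TEXT, obj_id INTEGER, name TEXT);
CREATE TABLE fields  (org_id TEXT, obj_id INTEGER, name TEXT, slot INTEGER);
CREATE TABLE data    (org_id TEXT, obj_id INTEGER, row_id INTEGER,
                      val0 TEXT, val1 TEXT, val2 TEXT);
""")
MAX_SLOTS = 3


def create_entity(org_id, name, field_names):
    """Define a new entity by inserting metadata rows; no DDL is issued."""
    if len(field_names) > MAX_SLOTS:
        raise ValueError("this toy supports at most three fields per entity")
    obj_id = db.execute("SELECT COUNT(*) FROM objects").fetchone()[0] + 1
    db.execute("INSERT INTO objects VALUES (?, ?, ?)", (org_id, obj_id, name))
    db.executemany("INSERT INTO fields VALUES (?, ?, ?, ?)",
                   [(org_id, obj_id, f, slot) for slot, f in enumerate(field_names)])
    return obj_id


def _lookup(org_id, name):
    """Resolve an entity name to its id and its field-name-to-slot mapping."""
    row = db.execute("SELECT obj_id FROM objects WHERE org_id = ? AND name = ?",
                     (org_id, name)).fetchone()
    if row is None:
        raise KeyError(name)
    slots = dict(db.execute(
        "SELECT name, slot FROM fields WHERE org_id = ? AND obj_id = ?",
        (org_id, row[0])))
    return row[0], slots


def insert_record(org_id, entity, record):
    """Store a record in the shared data table, one value per generic slot."""
    obj_id, slots = _lookup(org_id, entity)
    vals = [None] * MAX_SLOTS
    for field_name, value in record.items():
        vals[slots[field_name]] = str(value)
    row_id = db.execute("SELECT COUNT(*) FROM data").fetchone()[0] + 1
    db.execute("INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)",
               (org_id, obj_id, row_id, *vals))


def select_all(org_id, entity):
    """Map generic slots back to field names; every query is scoped by org_id."""
    obj_id, slots = _lookup(org_id, entity)
    rows = db.execute("SELECT val0, val1, val2 FROM data "
                      "WHERE org_id = ? AND obj_id = ? ORDER BY row_id",
                      (org_id, obj_id))
    return [{name: row[slot] for name, slot in slots.items()} for row in rows]


# Two tenants define same-named entities with different shapes, side by side.
create_entity("org_a", "Invoice__c", ["Number", "Amount"])
create_entity("org_b", "Invoice__c", ["Customer"])
insert_record("org_a", "Invoice__c", {"Number": "INV-1", "Amount": 250})
insert_record("org_b", "Invoice__c", {"Customer": "Acme"})
print(select_all("org_a", "Invoice__c"))
print(select_all("org_b", "Invoice__c"))
```

Looking at the `data` table directly shows only anonymous `val` columns, which is the "crude oil" problem from the talk: without the metadata mapping, the raw rows are not meaningful.
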