Possible changes in Noah

I've got some possible changes to Noah. These changes run the risk of being backwards incompatible, so I wanted to do a brain dump of ideas.

Background

When I originally thought of Noah, I had a specific use case in mind. I realized that, because this was a semi-newish concept for some folks, I would need to create some "examples" that helped people think about where Noah might fit.

These examples were what formed Noah's primitives - Host, Service, Application, Configuration. My original use case has always been around use of Ephemerals more than anything else.

One thing I was painfully aware of pretty early on was that primitives painted me into a bit of a corner. Essentially the current system causes issues with expanding the namespace. Right now, the routes in Noah have the following "reserved" paths:

/hosts
/services
/applications
/configurations
/ephemerals
/tags
/links

While these are awesomely friendly urls, every new "route" I add to the system prevents those names from being used as a custom namespace for the linking capability. This is because any route not found in the system is automatically looked up as a link object before returning a 404. In essence this restricted you to either using Noah's predefined primitives (which have a contract) or using ephemerals (which were nothing more than key/value with triggers - no validation or anything). There was no in between. Again, if I decide to add a NEW object type to the system, continuing down this path eats into the namespace more and more.
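
A rough sketch of the lookup order described above, written in Python purely for illustration (Noah itself is Ruby, and this is not its actual routing code). The names `RESERVED`, `resolve` and the `links` set are hypothetical.

```python
# Hypothetical sketch of the lookup order described above; not Noah's real router.
RESERVED = {"hosts", "services", "applications", "configurations",
            "ephemerals", "tags", "links"}


def resolve(path, links):
    """Classify a request path as a reserved route, a link, or a 404.

    `links` is an assumed set of user-defined link names.
    """
    first = path.strip("/").split("/", 1)[0]
    if first in RESERVED:
        # Reserved names always win, so a link with the same name is unreachable.
        return ("route", first)
    if first in links:
        # Unknown routes fall through to a link lookup before giving up.
        return ("link", first)
    return ("404", first)


links = {"my_company", "tags"}
print(resolve("/hosts/web01", links))       # ('route', 'hosts')
print(resolve("/my_company/hosts", links))  # ('link', 'my_company')
print(resolve("/tags", links))              # ('route', 'tags'): the link is shadowed
print(resolve("/nothing", links))           # ('404', 'nothing')
```

The third call shows the namespace problem: a link named "tags" can never be reached, and every new reserved route would shadow one more name.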

Additionally, I've been thinking about ways to "simplify" the objects in the system. While I considered them primitives, I felt there was a more flexible system under the surface, almost as if everything should have been an ephemeral with an optional schema on top.

My thought was that the system really needed a true primitive object with nothing but metadata to describe it. What this would mean is that the metadata for a "Host" would be a list of attributes and validations that made up a Host. My main concern was overcomplicating the system.

An idea from Jordan Sissel

Jordan has been talking to me for a while about his woes with a source of truth for his network. He wants a CMDB. He likes Noah's watch system but he needs something in between the existing primitives with all the validations and ephemerals with no validations. His idea revolved around schemas. This started to sound a lot like my metadata idea.

You can see the line of thinking he was going down here

A modest proposal

I spent some time thinking about it last night after a convo with Jordan and I have some ideas.

Convert existing objects in the system to some sort of schema/metadata based system

This would mean defining some sort of metadata syntax. Taking the case of the existing Host primitive, it would have something like this as a metadata/schema:

this is totally just an idea

{
  "id":"host",
  "attributes":{
    "name":"string",
    "status":["up","down","pending-up","pending-down"]
  },
  "validations":{
    "required":["name", "status"],
    "unique":["name"]
  }
}

What this essentially translates to is that a host must have a name and a status. Both are required. The name must be unique and the status must be one of the values in the array.
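
To make those rules concrete, here is a minimal Python sketch of a validator for the draft schema above. It is only one possible reading of the format: the `validate` function, its `existing` parameter, and the treatment of a list as an enumeration of allowed values are all assumptions, not part of Noah.

```python
# Hypothetical validator for the draft schema format above; illustrative only.
HOST_SCHEMA = {
    "id": "host",
    "attributes": {
        "name": "string",
        "status": ["up", "down", "pending-up", "pending-down"],
    },
    "validations": {"required": ["name", "status"], "unique": ["name"]},
}


def validate(schema, record, existing=()):
    """Return a list of error strings; an empty list means the record is valid."""
    errors = []
    rules = schema.get("validations", {})
    for field in rules.get("required", []):
        if field not in record:
            errors.append(f"{field} is required")
    for field, spec in schema["attributes"].items():
        if field not in record:
            continue
        value = record[field]
        if spec == "string" and not isinstance(value, str):
            errors.append(f"{field} must be a string")
        elif isinstance(spec, list) and value not in spec:
            # A list is read as an enumeration of allowed values.
            errors.append(f"{field} must be one of {spec}")
    for field in rules.get("unique", []):
        # Uniqueness is checked against the assumed `existing` records.
        if field in record and any(r.get(field) == record[field] for r in existing):
            errors.append(f"{field} must be unique")
    return errors


print(validate(HOST_SCHEMA, {"name": "web01", "status": "up"}))  # []
print(validate(HOST_SCHEMA, {"name": "web01"}))  # ['status is required']
```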

Now the biggest concern I have with this is that it feels "complicated" for what it does. Maybe too complicated. This is not a perfect example either, mind you.

When this schema is created, it inherits a few other capabilities:

Schema versioning
Record versioning
ACLs
Auditing
CRUD

along with the existing tagging, linking and watches.

Now the new objects can be created at the top-level url path under /host. You could conceivably even allow any attribute to be a filter under that path:

/host/myhost - a single record
/host/status/up - all with status up
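
One way the two path shapes above could be told apart, sketched in Python. The in-memory `HOSTS` list, the `ATTRIBUTES` set and the `lookup` function are hypothetical, and distinguishing a record from a filter by the number of path segments is an assumption rather than a settled design.

```python
# Hypothetical in-memory records; in Noah these would live in the backend.
HOSTS = [
    {"name": "web01", "status": "up"},
    {"name": "web02", "status": "down"},
    {"name": "db01", "status": "up"},
]
ATTRIBUTES = {"name", "status"}  # attribute names taken from the host schema


def lookup(path, records=HOSTS):
    """Resolve /host/<name> to one record and /host/<attr>/<value> to a list."""
    parts = path.strip("/").split("/")
    if parts[0] != "host":
        raise KeyError(path)
    if len(parts) == 2:
        # Single record by name; None if nothing matches.
        return next((r for r in records if r["name"] == parts[1]), None)
    if len(parts) == 3 and parts[1] in ATTRIBUTES:
        # Attribute filter: every record whose attribute equals the value.
        return [r for r in records if r[parts[1]] == parts[2]]
    raise KeyError(path)


print(lookup("/host/web01"))      # {'name': 'web01', 'status': 'up'}
print(lookup("/host/status/up"))  # the web01 and db01 records
```

Keying on segment count means a host that happens to be named "status" stays reachable at /host/status.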

Backwards compatibility

Mind you, Noah has not hit version 1.0 yet. Unlike most code in the Ruby world, I really don't want to break backwards compatibility in the middle of a release cycle. I've also already documented the existing system, presented on it, blogged about it and even rolled it out to my own production environment. This puts me in another pickle. How best to handle this new "way" of doing things?

Before I get into that, I am considering making some changes that would be transparent to the existing code base:

Path changes

Any "protected" path in the system will probably end up being prefaced with an underscore. This solves the issue of eating into available path space as well as simplifies the implementation of the ACL system.

_schemas - Schemas are stored here
_acls - ACLs are stored here

Other ideas:

_tags - Tags are stored here
_links - Links are stored here
_watches - Watches are here

Basically anything that gets mixed in to an object now (i.e. Taggable, Linkable) would get its own _ path in the system.
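
The appeal of the underscore convention is that "is this path system-owned?" becomes a single prefix check instead of a growing list of reserved names. A minimal Python sketch, where `is_protected` is a hypothetical helper and not part of Noah:

```python
def is_protected(path):
    """Treat any path whose first segment starts with '_' as system-owned."""
    first = path.strip("/").split("/", 1)[0]
    return first.startswith("_")


for p in ("/_schemas/host", "/_acls", "/tags", "/my_company/hosts"):
    print(p, "protected" if is_protected(p) else "user namespace")
```

An ACL layer could presumably hang off the same check, which may be what makes its implementation simpler.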

Back to compatibility

So now that we've got that part knocked out, what would be affected in terms of backwards compatibility?

Tags and watches

These paths could change to the underscored versions or I could simply say that /watches and /tags are reserved and leave it at that. This is probably what I'll do. The nice thing about underscore paths is that I can treat those as 'private' paths in the system and use the top-level equivalents as views.

Additionally, the postfix behaviour Noah currently has around tagging, linking and watching would still exist.

Existing primitives and ephemerals

This is where things start to come together. Jordan had the idea of schemas as packages. Someone might create an awesome schema around network gear that you want to use. What Noah would ship with by default is ALL of the existing objects as schemas in a bundle of sorts that could be entirely disabled or enabled. So, with care taken in upgrading existing data at the point this change is made, that 'pack' would be enabled and everything would behave as it did before.
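
As a thought experiment, a pack could be little more than a named list of schemas with an on/off switch. The Python sketch below is purely illustrative: the pack names ("noah-core", "network-gear"), the schema ids and the `active_schemas` helper are all made up.

```python
# Hypothetical pack registry; pack and schema names are illustrative.
PACKS = {
    "noah-core": {
        "enabled": True,
        "schemas": ["host", "service", "application", "configuration"],
    },
    "network-gear": {"enabled": False, "schemas": ["switch", "router"]},
}


def active_schemas(packs):
    """Return the schema ids contributed by enabled packs, in pack order."""
    return [s for pack in packs.values() if pack["enabled"] for s in pack["schemas"]]


print(active_schemas(PACKS))
# ['host', 'service', 'application', 'configuration']
```

With only the default pack enabled, the available object types match the existing primitives, which is the behaviour an upgrade would need to preserve.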

Links will probably change

I never finalized how links would work in 1.0. I only had a general idea. It's important to understand what the original goal for links was. It addressed two issues:

Multi-tenancy
Information overload

The planned implementation was that links would operate similarly to tags, with the exception that everything under the same "link" would be shifted down in the URL path under the link name. So while the top-level URLs were hosts, services, applications, configurations and so on, any of those linked to, say, /my_company would have that structure duplicated under the path /my_company with just the linked records. The idea was that you could then operate on objects under the link name in the same way as on the top-level ones. So you could call the existing verbs for /hosts as /my_company/hosts instead, and the linked records would exist under there. Additionally, you would be able to create a watch for /my_company and not have to do it PER object.
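
A small Python sketch of that "same structure, just the linked records" idea. The `RECORDS` and `LINKS` data and the `link_view` function are hypothetical. Since link behaviour was never finalized, this is only one reading of the plan.

```python
# Hypothetical records and link membership; names are illustrative only.
RECORDS = {
    "hosts": ["web01", "web02", "db01"],
    "services": ["http", "postgres"],
}
LINKS = {
    "my_company": {("hosts", "web01"), ("services", "http")},
}


def link_view(link_name):
    """Rebuild the top-level structure under a link, keeping only linked records."""
    members = LINKS[link_name]
    return {
        kind: [name for name in names if (kind, name) in members]
        for kind, names in RECORDS.items()
    }


print(link_view("my_company"))
# {'hosts': ['web01'], 'services': ['http']}
```

Here /my_company/hosts would list only web01, while /hosts still lists all three hosts.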

I happen to like Jordan's idea a bit more in that links now become the way to describe relationships between schemas (as opposed to hard coding it into the schema itself). This, however, does not address the multi-tenancy issue. I'm still trying to work this one through but I'm thinking that a new schema for an "organization" combined with Jordan's idea behind links could solve the issue.

Scorched earth

I could also take a scorched earth approach and accept the fact that because I'm not yet at 1.0 it's okay to change things this dramatically. I don't like this idea but OmniAuth just did it as well. Noah doesn't have anywhere NEAR the distribution that OmniAuth does so it's probably not that big of a deal. However, by attempting to maintain backwards compatibility I'm forced to think about how upgrades would happen between major releases now instead of realizing too late that I've done something uber inflexible.

Pros and Cons

I see a couple of big wins here. By moving to the schema approach, it becomes MUCH easier to handle the swap between persistence backends. I would only need to describe how to lay the generic schema bits out instead of each schema type.

Additionally, this approach gives Noah more room to grow. I've been loath to make changes to existing primitives and objects because everything was so fixed to the idea of an Ohm model. I also hated eating into top-level URL space with new objects. Yet people still wanted some flexibility to add custom information to the existing primitives. My answer was always "use ephemerals" but that felt wrong. Noah needs a way for people to extend it.

The downside is that, when starting to make this change, I run the risk of not being able to fit existing types into the schema mold exactly and HAVING to introduce a breaking change before 1.0. Also, this schema thing still feels a bit overly complicated. I think I can mitigate that quite a bit by knowing that schemas need to support the existing object types. In the end, nothing has to change if you don't want it to change - just use the baked-in schemas.
