CraigMattson.net

Building a Synchronised Christmas Lights to Music Display - Part 2

02 October 2021

Woah! I never expected that Christmas lights would take up as much time as they have so far. I wasn't under any illusion that it wouldn't take time - but as I write this on 2nd October, I'm reminded that I'm just over a month away from hanging stuff up on the house and running my first synchronised light show.

First - the props. In my last post, I left you with a picture of my 3D printed singing Christmas Tree and a view to making some new props. I started working on a similar layout to the singing Santa from Boscoyo, albeit at a suitable size for the space it'll be mounted in. The process was similar to the Christmas tree - eyeball the layout from xLights, reduce the size to the space I wanted it to occupy (around 60-70cm square) and draw in new 12mm holes for the bulbs to sit in.

The Singing Santa

Figure 1 - Drawing out Santa with a 30mm width.

I must admit, he looked a little creepy in this one. One thing I learned from the previous print is to try and keep bulbs at notable points. While you won't see the plastic once it's printed and the lights are up, not having lights right in the corners and edges means the lit-up shape will look slightly off. Also - Santa needed a red hat and white outlines for this setup, so while I didn't end up printing the plastic in white or red, the coloured sections roughly mark which lights will light up in those colours.

Figure 2 - Putting Santa into 200mm x 200mm sections for printing.

This one was a little more challenging, especially with the mouth piece just breaching 200mm (although with some angling, I eventually got this to print). The downside of this was the lack of supports around the beard - there's a lot of weight in the top half that's only reasonably well secured. I also noted that for at least one set of lights, I'll need to solder on a joining wire between parts to ensure we've got enough distance to wire him up.

These props will be mounted to a PVC pipe frame so I'm not ultra worried about it - but if I were to design this one again, there'd be two supporting brackets either side of the beard, up to the side of the mouth, and a couple of smaller chunks between the mouth and nose.

You'll note that given the success of the 'train track' join style, I've used it again here - it's been a good way to join plastic together with a solid fit without removing much plastic.

Figure 3 - Printing in Progress

In the lead-up to Santa finishing printing, my delivery from Amazon arrived - the 1,000 lights that I'd need to cut, splice and add pigtails to, along with the pigtails I'd ordered from Light it Up LEDs. Now I must admit I wish I'd discovered Light it Up LEDs much earlier than the Amazon order - the bulbs were a little cheaper, and he had LEDs with the connectors already soldered on. That would have been a tremendous time saver, but nonetheless I had these bulbs and had to cut them into odd numbers. For this guy, he is 203 bulbs so ... 4 x 50 + 3 from another strand. Another learning here: try to design for multiples of 50. For reference, the Christmas Trees are 156 bulbs each. If I could reduce (or increase) the density, it would have made wiring easier.

Figure 4 - Light Bulbs arrived!
Figure 5 - Pigtail Connectors arrived!

I don't have a lot of pictures of me soldering wires away, but it's fair to say I've soldered a good couple of hundred pieces of wire together. I've got pretty good at it too. But there's one thing I kept missing... the bloody heat shrink. I really need to print this out and frame it, frankly.

Figure 6 - The meme I really need framed.

Anyway... with pigtails attached, wires spliced with extension wires and some waterproofing done (silicone around the exposed connectors), it was time to turn Santa on. And wow - it looks great.

Figure 7 - Santa lit up and connected to xLights
Figure 8 - Santa singing 'blinded by the lights...'

Now the eagle-eyed of you may note there's a hole with a bulb missing! Yep - I missed one bulb. And yet... I didn't have any leftover bulbs. I'm not even sure how both errors lined up. In any case, always remember to count your bulbs. Sorry Santa, but you'll probably be replaced next year.

The Candy Cane

Ok class, time for a bit of math. If I have 1,150 bulbs, Santa consumes 203 and the two singing trees take 156 bulbs each, how many do I have left? Well, of course the answer is 635, which I need to distribute between 5 mini trees + stars and 4 candy canes. Each mini tree would be around 90 bulbs (70 for the spokes, 20 for the star), leaving around 185 bulbs to play with. That works out to around 40 bulbs per Candy Cane, with some spares for repairs I might need to do along the way.

So I designed a 39 bulb Candy Cane. It'd have been nice I think to add another strip in the middle of them but ... these props will probably be replaced next year anyway. So out comes good ol' Sketchup and I start working on a design.

Figure 9 - Candy Canes in Sketchup

A pretty simple design - in future I'd probably put some holes in the middle so cable ties can go through rather than around the whole prop. But for this year it should be fine. To make them look a little decorative, I'd alternate white with red ABS. I certainly have plenty of filament so might as well make use of these colours.

Figure 10 - Trial Prop - all in red.

After a day of printing the parts out, I have a good idea on how this is going to look. After wiring a few bulbs in we have some success. 

Figure 12 - Ready to go!

Honestly, having put the pixel tester in and seen it light up - I was a bit underwhelmed. It really needed that middle strip I think, but we'll go with the two lines and see what happens. I'm sure I can make some decent effects on it and you'll be seeing it from the footpath anyway. So I digress...

With this one out of the way, it was time to set up a bulk print order...

Figure 13 - Bulk Printed Candy Canes
Figure 14 - Assembled Candy Canes

Seeing these fully assembled though - I'm pretty happy with how they look. And most importantly, my son loves them. There's something in the reflectiveness of the white that is intriguing to him.

The Mini Tree (with Star)

So the star of the show (ha ha) would be some of the mini trees. They'll carry most of the beats and the more interesting displays, but they'd also be the most difficult to print. This is perhaps where sourcing Coro material from a local supplier would have been a little more effective, but I figured I could print a base with the star to mount to, and have holes for 'spokes' to sit in. I'd use tent pegs to hold the legs in place, which gives me a bit of an opportunity to perfect the placement of these bulbs at least for the first year. It's fair to say these'll probably be replaced as well in future.

Figure 15 - The star and base, including one leg sitting in the slot.

The base is fairly solid. I'd discovered that a 25% Gyroid Infill was a good balance between a rigid structure and enough weight. But I probably should have printed some sort of base for a PVC pipe to hold it up. That's OK... nothing a bit of Loctite won't fix... after all, if it's good enough for my brake calliper bolts, it's good enough for my PVC pipe. The plastic cement stuff, anyway.

These definitely produce an interesting effect and the adjustable spokes will be good for putting a little bit of personality into the display - something that's a bit missing when you carbon copy everything several times.

Figure 16 - The Finished Tree on a Tripod.
Figure 17 - Tree lights are on.

These will definitely look great out on the lawn come Christmas time! Obviously there's some waterproofing to do and more pigtails to solder *sigh* ... but we're good to go on these ones. So cue the bulk print job for another week's worth of stuff.

Figure 18 - Bases, Stars and Spokes.

Now I've got to say, all this 3D printing is burning through filament fast - but how much, you say? Each of the larger props (Christmas Trees and Santa) used less than a 1kg roll each. With each ABS roll costing around $25 AUD, it's not the most economical approach, but it is allowing me to print things to the dimensions I want. For a lot of the Candy Cane stuff, Christmas Trees etc... there's probably around a roll for the lot - they are thinner prints with less plastic. But it does take a lot of time. The Christmas Tree bases alone - 4 of these in addition to the test tree - were around 1/4 of a roll and took over a day. Each tree needs 5 spokes, across 4 trees; each set took over a day to print and the lot probably consumed 1/2 a roll. The stars were similar - a day to do those and around 1/8 of a roll. So expect around a week of printing, at least on my printer which runs at around 30mm per second. I'm sure more modern printers can run much faster these days, and with the right infill models in place you can probably reduce a lot of rework.

Where to from here?

Well, more progress has been made. I've built a sample matrix and even found a use for some cheaper Chinese bulbs that I thought I'd trial. These have been done in the last two weeks, but this post is already 18 images deep so I'll leave it a couple of weeks and showcase those. I've also got myself heavily into the sequencing game, so there's probably at least 3 more posts before the actual show.

I must admit - this has been one of the most enjoyable creative projects I've worked on in years. Being able to build something with your own hands and see it go is great. Watching your son full of enjoyment when the lights come on takes that to a whole new level. Anyway, I hope you've enjoyed seeing the pictures and if you're inspired to start working on things, once again I can recommend taking a bit of a look at AusChristmasLighting for some further inspiration. But be warned - it's not something you're going to be doing if you've only got a budget of $500 or less.

But ... I am keen on working out what kind of show you could build for $500(ish) if that's something people are interested in next year. There are some good shortcuts and devices you can buy for light sequencing, paired up with some traditional AC-style lights, that I reckon could be a winner. Also - it looks like Bunnings are getting in on the action with a peculiar Christmas Tree that has individually addressable lights. You can be sure that people are hacking these already to work with the WS2811 protocol. Costco even has these 'Twinkly' branded trees at the time of writing for $199. In any case, it's not a cheap hobby, but if you can do some of the tinkering yourself and you already have Arduinos, Raspberry Pis and are set up with enough to be dangerous, you can do most of it yourself and save on those dollars.

 Anyway - until next time. Ciao! 

 


10 Things I Wish I Knew Upfront About Amazon RDS for SQL Server

14 September 2021

Amazon RDS for SQL Server is Amazon's answer to managed SQL Server instances, in a similar vein to Azure SQL Managed Instance. You're provided with a functional SQL Server instance without the ability to log into the underlying servers. This makes it a great option for teams wanting to remove the routine patching and instance-style snapshots from their daily activities, however it also means many of the tips and tweaks you've learned as a SQL DBA over many years may no longer apply.

Amazon use several magic tricks to make the whole process work, at the expense of masking or - if you're lucky - moving your typical DBA management functions to new locations. If you've read the sales pitch about how RDS makes it easy to set up and operate, you might be forgiven for thinking this is a direct drop-in replacement ready within days, or weeks at the latest, and depending on your workload you might be able to overlook some of what I'd consider 'missing features'. In fact, the sales pitch isn't wrong, but I'd suggest you really consider the impact of managing your own OS patching and SQL Server upgrades versus losing the ability to perform database backups your own way, tweak esoteric permissions, or even manage your system indexes and file placement. You're going to find out pretty early on that all that knowledge you learned from your time managing Self-Hosted SQL Server instances (in any capacity) needs to change:

  • provisioning - hopefully you're doing this only once for Production.
  • backups - this happens, just not the way any SQL Server DBA will expect it to happen.
  • software patching - yes, majors are a request click away and minors can be automated or manually ordered.
  • monitoring - for hardware failures yes, but this is very much a shared responsibility model.
  • hardware scaling - the basics of any cloud or Virtual Machine tool really.

Don't get me wrong - you can create a reasonable platform within Amazon RDS for SQL Server - if responsibility for managing this particular instance is something you want to remove - or perhaps you're a team without a traditional DBA role already. But I'd also suggest if you're already managing a fleet of servers, an additional SQL Server isn't going to make any considerable change to their workloads - especially if you want to treat your SQL Server cluster as the 'holy grail' within the organisation.

1. Read the User Guide for SQL Server, and Read it Again.

Read. This. Guide. I'm definitely one of those that skims through documentation seeking things that may stick out as red flags - especially when so much of it is 'scenario' driven (i.e. feature X doesn't work when RDS is configured this way, or workaround issue X by doing something that seems quite destructive and counterintuitive - we will get to those). I spent a little time upfront going through the Limitations section expecting I'd find out what areas to focus on and build workarounds for just by sticking to that section alone. What you don't get from that is a sense of how 'limited' a particular function will actually be for your workload.

While you're unlikely to remember everything from reading the guide, or be any clearer about how you're going to deploy this wild and untamed beast, you'll hopefully have trigger phrases ready to come back to when you go about diagnosing why things don't work. Make sure you also read through the Common DBA Tasks to get an idea of the scale of changes to your normal everyday processes that will require testing and rework - everything from creating logins, users and roles, through how your database automatically resolves a recovery model based on some semi-related parameters you set (this one caught me by surprise when working with replication via DMS), to the many limitations of working in Multi AZ environments.
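
As a concrete example of that recovery model behaviour, here's a quick sanity check I'd suggest running before relying on anything that assumes FULL recovery (a minimal sketch - the system database filter is just illustrative):

  -- RDS derives the recovery model from your automated backup retention rather
  -- than letting you set it directly: retention > 0 keeps databases in FULL,
  -- a retention of 0 quietly drops them back to SIMPLE.
  SELECT name, recovery_model_desc
  FROM sys.databases
  WHERE database_id > 4;   -- skip master, tempdb, model and msdb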

2. Backups and Restore Procedures.

You've read through the manual and probably didn't see anything about backups except some loose references to importing / exporting data, a list of limitations much longer than the product itself, and Amazon RDS's overall backup strategy. Simply put, this is the weakest area of Amazon RDS for SQL Server.

First thing to note - you are not backing up individual databases, you're snapshotting a fully functional instance with Point in Time Recovery capabilities. This is an important distinction that is difficult to explain to the powers that be who have never had to get into the weeds and recover from data corruption. In fact, I don't think in my entire history of working with SQL Server I've ever found myself restoring a full instance from snapshot - or a production database in full. Typically, your production workload is still operational with corrupt data - so you attempt to fix it by writing queries to update bad data with good data.

To do this, you'd use Transaction Logs to restore a copy of the database to your specified Point in Time (via SSMS) and begin either writing queries or using data compare tools to write expected data back to where it belongs. Depending on how big the database is, this is usually a several-minute process, especially when your database is < 100GB and Transaction Logs are available on the host. Obviously clustered environments will take longer with the synchronisation process, however it's typically a fast resolution if you know the time an issue occurred.
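
For context, on a self-managed box that point-in-time copy restore is nothing more exotic than the following (a rough sketch - database names, logical file names, paths and the STOPAT time are all placeholders):

  -- Restore a copy of the database alongside the original, then roll the logs
  -- forward to just before the corruption occurred.
  RESTORE DATABASE MyAppDb_Copy
      FROM DISK = N'D:\Backups\MyAppDb_Full.bak'
      WITH MOVE 'MyAppDb'     TO N'D:\Data\MyAppDb_Copy.mdf',
           MOVE 'MyAppDb_log' TO N'D:\Data\MyAppDb_Copy.ldf',
           NORECOVERY;

  RESTORE LOG MyAppDb_Copy
      FROM DISK = N'D:\Backups\MyAppDb_Log.trn'
      WITH STOPAT = '2021-09-14 10:42:00',
           RECOVERY;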

This leads to the second point. Amazon RDS for SQL Server does not support backing up Transaction Logs. You kind of have access to differential backups, however the instance snapshots are apparently tied into the Transaction Logs, meaning the chain could be broken if they allowed you access to them. So it makes sense why they are not available - but it doesn't make recovering from data-corruption style disasters any easier.

To work around this, you may find yourself building processes that include restoring an instance to a Point in Time (you'll get a brand new host name and you'll need to update your firewall rules, IAM policies, Security Groups, etc...). This will take some time to complete - and make sure you do it without Multi AZ unless you want to double your restoration time. You'll then need to make a decision on how you're getting the database over to your existing instance. You could do this using native SQL backups and restores, or by establishing Linked Servers, or with some sort of data pump to get the relevant data over to your new platform. All of these will impact your RTO / RPO objectives, so it's important to Test, and Test again.

If you need to compress your native backups, you'll need to run a custom procedure to globally enable compression, run the backup procedure, and then disable compression again afterwards (assuming you don't want all backups to be compressed). I'm fine with compression for most tasks - but if you have governance and compliance reasons not to subject backups to possible CRC issues, you'll want to be familiar with this.
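
A minimal sketch of what that dance looks like, based on the native backup procedures AWS documents (bucket ARN and database name are placeholders - double-check the parameter names against the current docs):

  -- Turn compression on globally, run the backup, poll the async task, then turn
  -- compression back off so other backups aren't affected.
  EXEC rdsadmin..rds_set_configuration 'S3 backup compression', 'true';

  EXEC msdb.dbo.rds_backup_database
      @source_db_name           = 'MyAppDb',
      @s3_arn_to_backup_to      = 'arn:aws:s3:::my-backup-bucket/MyAppDb.bak',
      @overwrite_s3_backup_file = 1;

  EXEC msdb.dbo.rds_task_status @db_name = 'MyAppDb';   -- wait for SUCCESS

  EXEC rdsadmin..rds_set_configuration 'S3 backup compression', 'false';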

I'd certainly recommend you come up with Run Books to help you perform this kind of action, because as you're going through the process you'll find AWS isn't very good at telling you when you've made a mistake. Case in point - I started a restoration where I forgot to set the right Option Group, and I wasn't told until 90 minutes after requesting the restore that it had failed due to a missing option.

You may also find some strange behaviour in your testing whereby restores just fail, so you're going to want to become well acquainted with your Premium Support providers in case this sort of behaviour happens in your critical production workloads. I must say, AWS have been pretty responsive in this department, but they do have SLAs you're going to want to be mindful of.

Finally, you may want to consider only having one database - or at most an application per instance. I realise this is a freakin' expensive way of doing it, however it is in line with AWS's one service per application model which removes much of the complexity in workarounds. This is a timely point to lead into the next issue...

3. Know your Costs - it's more expensive than you'd think!

Amazon RDS for SQL Server is expensive. Running a Multi AZ z1d.2xlarge SQL Server Enterprise instance with a modest 8 CPU / 64 GiB RAM footprint will set you back around $8 USD per hour before you add Provisioned IOPS, Automatic Snapshot Retention, SQL Native Backups and Data Transfer costs. Add Read Replicas to this pool and you're talking over $10,000 USD per month.

Running on EC2 doesn't make this a lot cheaper (cents in the dollar), however I would argue that running a larger cluster for all of your database requirements, including adding separate instances, gives you greater cost control when you have several applications to support (let alone the complexity you remove by having more of your databases in a single location).

It's also worth noting that for development environments, you cannot run SQL Server Developer Edition (licensing restrictions, as Amazon are running your environment) - so even in your Dev / Test environments, you're going to be paying for full SQL Server instances. You can run SQL Server Express if your application supports it, but it's hardly a good option if you need to profile your queries and other things typically involved with any sort of SQL query development.

If you're trying to remove the complexity of server patching (which would ordinarily be done by your Server Engineering team anyway), and of failing over SQL Server while doing so, it's worth comparing the limitations and workarounds to see if there really is a cost saving - particularly if you're using largely the same staff to look after other workloads.

4. You're not really the DBA, Amazon is! And even then, they're automated bots.

This part cannot be overstated. There's a gap between Amazon providing a managed service - servers that can be automatically patched, with some pretty smart AI behind it to do typical functions - and bots that aren't particularly good at understanding the personality of particular workloads. For example, if a bot decides to increase your storage because it thinks your database is bigger than it really is, you may find yourself stuck like a Roomba that's lost its sight and continuously bashes itself into a wall. I've been in this situation and raised tickets with AWS only to confirm that we really did have to wait for the bot to time out or work out what was wrong with the database.

Sometimes, you just need to pull the plug. If, however, you find your server stuck in a storage optimisation loop, you can't pull the plug or fail over. That means that if your performance is pathetic, you have no levers to pull - except to wait out the timeout loop with AWS. That's not to say they can't do specific things, but you're really at the mercy of someone who doesn't understand the characteristics of your workload and what kind of risk you may want to take - versus all their other customers.

Similarly, you're not the root user. At best, you're a processadmin and setupadmin, which affords you some rights to the server along with several esoteric grants and roles you're added to. If you're adding other users to 'administrate' the database, you're going to want to plan ahead. You may want to add Active Directory groups, server-level roles and database-level roles, armed with scripts to add the necessary grants - especially for your Dev / Test environments. You'll find yourself changing things slightly when it comes to managing SQLAgentUserRole because you're heavily restricted in the msdb database. In any case, you're going to want to arm yourself with several scripts to make this process consistent and easy. I found myself writing lots of cursors and interrogating the database to copy permissions to several groups, and it took several goes to get the right methods in the right place. Determine this upfront, or adding users individually to databases will become your destiny - and you don't want that in a cloud environment!
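
For what it's worth, the provisioning pattern I keep reaching for looks roughly like this - a sketch only, with a hypothetical AD group standing in for whatever you use, and subject to what RDS actually lets you touch in msdb:

  -- Server-level login for a (hypothetical) AD group.
  CREATE LOGIN [MYDOMAIN\App-Admins] FROM WINDOWS;

  -- Database-level user and role membership in the application database.
  USE MyAppDb;
  CREATE USER [MYDOMAIN\App-Admins] FOR LOGIN [MYDOMAIN\App-Admins];
  ALTER ROLE db_owner ADD MEMBER [MYDOMAIN\App-Admins];

  -- SQL Agent access is gated through msdb roles - one of the few things in msdb
  -- you can still reach on RDS.
  USE msdb;
  CREATE USER [MYDOMAIN\App-Admins] FOR LOGIN [MYDOMAIN\App-Admins];
  ALTER ROLE SQLAgentUserRole ADD MEMBER [MYDOMAIN\App-Admins];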

You don't have access to maintain system indexes and statistics. If you're an Ola Hallengren script user, you'll note that you probably have backups and regular maintenance jobs scheduled as SQL Agent jobs. You'll need to tweak these scripts to ensure you are only building and running what you actually have access to.
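
As a hedged example, an IndexOptimize call trimmed back to what RDS will let you touch might look something like this (parameter values are illustrative, not a recommendation):

  -- Index and statistics maintenance limited to user databases only, since the
  -- system databases (and Ola's backup jobs) are off limits on RDS.
  EXECUTE dbo.IndexOptimize
      @Databases              = 'USER_DATABASES',
      @FragmentationLow       = NULL,
      @FragmentationMedium    = 'INDEX_REORGANIZE',
      @FragmentationHigh      = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
      @UpdateStatistics       = 'ALL',
      @OnlyModifiedStatistics = 'Y',
      @LogToTable             = 'Y';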

Finally - ensure you know how to work with AWS Premium Support. I've faced many issues that I've needed explanations for - due to document clarity or undocumented 'features'. You don't want to find yourself in the middle of an incident only to realise that you have no levers to pull, because you're not aware of the limitations. You'll also need them to interrogate logs for you from time to time.

5. SSRS support is utter garbage.

SSRS and SSIS support have been added to the RDS platform in the last 12 months. I haven't looked into SSIS support, but I understand there are several limitations around third-party libraries and access to certain functions common among scripts we have in place. However, I had the unfortunate experience of working with SSRS.

Best to avoid this one if you use SSRS in any real capacity. The SSRS Portal does not support custom domains and yet it requires SSL. Because the certificate is self-signed, you'll find yourself tracking down the certificate for your specific region and installing the Root CA onto your computer. If you've got a fleet of 2,000+ devices, you're going to be making Group Policy changes that would otherwise be unnecessary if you had a custom domain. To add to the problem, your users that may be used to visiting a URL such as https://reports/ would now need to remember https://your-database-id.your-account-random-characters.ap-region-no.rds.amazonaws.com:8443/ - that's a mouthful at the best of times. If you're using CloudFormation, you're not going to want to set an instance ID, so you'll get strange characters like: https://abcd1234efgh5678.ijkl9012mnop.ap-southeast-2.rds.amazonaws.com:8443/ - easy, right?

Even if you accept that madness, you'll then find yourself buried in esoteric stored procedures to add users to the portal. If you do this, they become Management users before you need to restrict access. You can't schedule reports to e-mail or store on a network share and the kicker - you can't use SSRS with Read Replicas in Multi AZ environments.

This one flabbergasts me the most - that such an incomplete feature went GA. I'm not sure who would have a use case for SSRS with the limitations above.

6. The perils of Multi AZ environments.

It stands to reason that if you're running mirrored databases, there will be some limitations to ensure integrity between your database instances. However, it would seem you have very limited options when running Multi AZ environments, and some issues to sort out before going live. Things you will want to be aware of include:

  • You cannot rename a database. 
  • Your SQL Agent Jobs and Linked Servers are not replicated to your other server and you have no way of logging into the other server without performing a failover.
  • You cannot restore DIFF backups (if you were thinking this would be a good stop gap solution to lack of Transaction Logs, this falls over here).
  • Failover is not an immediate action - it takes around 40-50 seconds for a failover to occur. It's not bad, but if you need 100% uptime this may be a concern.

Most of these actions require you to convert your instance to Single AZ, running the risk of an Availability Zone failure causing damage or outages to your database. And here's another kicker - you would also have to remove any Read Replicas you have set up prior to performing those actions, as Read Replicas are only supported in Multi AZ mode.

Amazon provide a pretty comprehensive list of activities you're going to want workarounds for. I'd certainly recommend that you familiarise yourself with the list and test out scenarios that are quite destructive (i.e. removing Read Replicas, moving to Single AZ mode and back).

7. Available replication options - Read Replicas vs DMS.

This one was the biggest time sink for me. If you have a requirement on a Self-Managed SQL Server instance to provide a replicated data source for third parties to access, you might consider setting up the publisher and distributor model which comes out of the box with SQL Server. It's pretty easy to set up, and data will replicate - including all programmables - over to the target.

It's not foolproof - it's still subject to the various integrity rules embedded in your database, but the advantage here is that you can access real-time data without impacting your primary database. You can add new indexes, tables, queries etc... and because it's on a separate instance, other users can create other databases on that server. It's dirty in my book from a data integrity perspective, but it fits a niche data mining requirement.

The first thing to note in Amazon RDS for SQL Server is that acting as a distributor is not supported, which rules out setting up this model (an RDS instance can receive replicated data, it just can't publish it). This leaves you with a few possible options.

Amazon provide a service called Database Migration Service (DMS) and, as the name implies, it's a tool used to migrate data from one data source to another. It's not a drop-in replacement for replication; its purpose is to keep data in sync to help facilitate migrations from specific sources to destinations, and it can help with data transformations along the way. It's not a silver bullet though, and it's not magic. It's not aware of the esoteric nature of SQL Server and as such will not copy over your programmables, it has various limitations around supported data types and, in the time I used it, it wasn't particularly reliable.

In order to set DMS up, you'll find yourself performing a number of steps. The first SQL option you'll need to add is MSDTC. This doesn't seem to have any material impact on performance, but it requires you to configure a non-default Option Group (you'll be doing this for Native Backups and SSRS anyway if you're using those). Once this was set up, a full load task was created and I managed to see a schema with data in it over at the target. Ongoing replication requires CDC to be enabled - and it's not just a case of turning on a job, you'll need to enable CDC per table, which also introduces various tracking tables. So with a capture job, polling interval configuration, and CDC stop and start underway, I interrogated good old INFORMATION_SCHEMA.TABLES for specific schemas (ignoring system tables), checking whether a Primary Key existed to appropriately set the @supports_net_changes flag in sys.sp_cdc_enable_table. Either way, it was a lot of ugly, hacky scripting to enable some kind of equivalent to the publisher / distributor model mentioned earlier.
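
Stripped of the cursors, the per-database and per-table CDC enablement boils down to something like this sketch (database and table names are hypothetical):

  -- RDS wraps sys.sp_cdc_enable_db in its own procedure because you aren't sysadmin.
  EXEC msdb.dbo.rds_cdc_enable_db 'MyAppDb';

  -- Then CDC has to be switched on table by table; @supports_net_changes = 1 only
  -- works when the table has a primary key (or a unique index you name explicitly).
  USE MyAppDb;
  EXEC sys.sp_cdc_enable_table
      @source_schema        = N'dbo',
      @source_name          = N'Orders',   -- hypothetical table
      @role_name            = NULL,        -- no gating role
      @supports_net_changes = 1;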

Things kind of worked - at least the dashboard was showing data migrating across, and querying the other end certainly showed data being replicated. Unfortunately, it would 'fail' from time to time and completely stop transferring data due to validation. I couldn't really understand why either - the data in the Source was needed at the Destination. A few support tickets later, and with no week where we went totally error free, it was time to pull the plug. To productionise this would require a way of sending over all Stored Procedures, Functions, etc... and ignoring Foreign Keys (because even though CDC should guarantee transaction order, for some reason it wasn't being honoured).

One particular issue identified by AWS Premium Support via logs even pointed to the fact that the database I was testing was in SIMPLE recovery mode. So, back to Point #1 in this list - it had completely slipped my mind that by disabling automatic snapshotting in AWS for our Dev / Test environment, I'd inadvertently disabled FULL recovery mode. After a few more weeks of testing, we still had the same issues where CHECKSUM columns were not being calculated correctly and Foreign Keys would prove to be a problem for use in Power BI.

That leaves the second option, which we are exploring via Read Replicas. The main challenge here is that we would be spinning up a dedicated instance so that our primary database doesn't take a performance penalty, and we would also require SQL Server Enterprise Edition. The Read Replica works exactly as you'd expect, with some hiccups along the way. I've already noted that you can't have SSRS enabled when you have Read Replicas enabled, but perhaps the biggest offender is that Read Replicas won't run without Multi AZ mode. So if you are going to run Read Replicas, keep in mind that you'll need to delete the Read Replica if you plan on moving back to Single AZ for any particular maintenance requirement. You will also find that your security (credentials and users) is copied across to your Read Replica - so segregating reporting-style users from the primary database may require some bespoke work too (you don't want your users accidentally connecting to your primary instance!).

I'm still not sure that Amazon RDS for SQL Server really provides a good resolution for our replication scenarios. We wanted an OOTB experience rather than building curated sets of data - which is of course what we should be aiming for, but requires some dedicated work to get this sort of stuff right.

8. You can't do things while those AWS bots are doing things

You won't find this one out until you're in a hurry to do something. Let's say you want to change an instance size for performance reasons, or perhaps you want to change some options on your server. Most of the time, this just works - that is, until your database instance decides it wants to do a maintenance activity outside of the allocated window. I found myself several times fighting the RDS console trying to do anything - backups, restores, failovers, you name it - because Amazon decided that Storage Optimization was required. Be aware of it, because you can't cancel the activity, and if your database requires that maintenance urgently, there's not much you can do except wait it out. In one case, it took over 12 hours to optimise the storage volume from 500 GiB to 1,000 GiB. There was still some 400 GiB free, and this seems to occur totally randomly.

9. Shrinking Data Storage Volumes if they auto-grow

Amazon RDS allows you to specify minimum and maximum values for storage auto-growth. This seems like a great way to set up parameters such as 500 GiB to start with, with auto-growth to say 2,000 GiB. Storage isn't hugely expensive - at $0.138 per GB, 500 GiB will set you back around $69 USD per month. But if, thanks to a bug in SQL Server reporting the wrong database storage size, your server auto-grows all the way to 2,000 GiB, you'll find the logs complaining that it can't grow any further than that. So now you're paying $276 USD per month for storage you're not using, and didn't need - because of a bug.
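
If you suspect the bot is acting on bad numbers, a quick look at what SQL Server itself thinks is allocated is a useful sanity check (a simple sketch):

  -- Compare what SQL Server thinks is allocated per database against what the RDS
  -- console claims is in use (sys.master_files reports size in 8 KB pages).
  SELECT DB_NAME(database_id)            AS database_name,
         SUM(CAST(size AS bigint)) * 8 / 1024 AS allocated_mb
  FROM sys.master_files
  GROUP BY database_id
  ORDER BY allocated_mb DESC;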

You might think you can simply shrink the storage back down to reclaim that space (I'm not talking about database shrinking here - which for a whole host of reasons is inadvisable), but alas - no. You'll be building a new instance, dumping out a backup (remember that Native Backup option earlier? You're going to want that in place...) and, unless you have the right provisions in place, you may induce downtime. If you're not using CloudFormation or other Infrastructure as Code tooling, you'll also find yourself updating IAM Policies (assuming you are doing the security thing!) and Route53 entries at a minimum if you have applications relying on hostnames existing.

Luckily this has only happened perhaps 3 times out of 30 provision and teardowns.

10. CloudFormation is incomplete.

This final item is not unique to Amazon RDS for SQL Server, but it's worth calling out. I started out with a series of nested CloudFormation templates and eventually moved them over to our CI / CD tool to run individually and take advantage of SSM to store known locations. For the most part, this process is awesome. It builds S3 buckets, provisions IAM policies, Option Groups, Parameters and Instances, and runs a series of scripts to work through the limitations above (i.e. user role creation etc...).

But there are gaps. Amazon RDS for SQL Server does not support the creation of domain connected Replication instances via CloudFormation. Your immediate reaction may well be to ClickOps your way through the problem - after all, this will only occur once in Production.

For various reasons - especially in Multi AZ environments where you will be winding Multi AZ back to Single AZ for any notable change requiring it - you'll want a set of templates and scripts that does this for you. If you don't, those policies, DNS entries, security groups, NACLs and other components will all require manual manipulation, or at least some crafty scripting to put them back in place. At this point, there's probably an argument to get onto Amazon CDK or another third-party product - Terraform or Pulumi, perhaps?

Another issue to be mindful of (and I can't remember how to replicate this one fully): I made a change to an instance type and wanted to make it Multi AZ. The reasoning behind not specifying a Database Identifier in CloudFormation is sound - you may find yourself needing to build a new instance, which will cascade down through your template and make the necessary adjustments. However, when this happened, CloudFormation did indeed snapshot the old instance, but it did not restore that snapshot to the newly provisioned instance. So I ended up losing the primary database and had a blank, fresh new instance ready to go.

If you are using Infrastructure as Code as part of your process, you're going to want to tease out these sorts of changes to make sure that come game day, when you do need to make a change, you're not going to blast away your entire instance. The process itself may take an hour or two to complete; you don't then want to be kicking in your runbooks for restoring an instance from snapshot, adding further hours. That could turn into one mega-outage.

Final Thoughts...

If you're considering adding Amazon RDS for SQL Server to your stack, please take the time to understand the typical BAU activities that your DBAs conduct and ensure they are part of the journey. There are a lot of processes that need to change when adopting Managed Database Instances like this. Also be prepared for a lot of pushback. Those new to the product, having read the sales pitch and having gone through the pain of setting up clusters before, may have high hopes for a platform such as this. It's a journey, and perhaps the biggest struggle is showcasing exactly where the limitations are and how they have real-world impacts on managing your application, especially in the case of a disaster.

I'd suggest that if you're running Tier 2 / Tier 3 style applications (those that are not mission critical), you may find that Amazon RDS for SQL Server is a reasonably adequate product, given the limitations are perhaps less of a concern to work around. However, for Tier 1 applications with lots of variables and changes you may need to coordinate, I'd suggest you reconsider what you're actually trying to save by using Amazon RDS for SQL Server over, say, running the EC2 variant in AWS.

In any case, it's fair to say that SQL Server in general is probably getting to its use-by date for new applications. As great a product as it has been over several decades - and it's certainly my most familiar RDBMS - the way we write and consume applications has changed. Perhaps those data mining activities of 10 years ago, while doable in an RDBMS, might become part of the application itself, or you may want to invest in emerging technologies like Athena to help change the mindset that you must query specific resources a particular way. But also consider the staff you have. If their skill base is in SQL Server, moving to new platforms is not a silver bullet. Your team will need to 'want to' take it up - otherwise you'll create a bigger logistical nightmare to manage.

Cloud is complicated. There's more that can go wrong and you have to design for failure. As such, the comfort of a single environment with all the bells and whistles doesn't usually fit a highly available and scalable landscape. That sort of design is not for the fainthearted. There's a lot of leeway you get when you have total control of a stack - it allows you to write incredibly bad and unreliable code, reliably.

I would suggest that if you really want to adopt Cloud, you're going to want to take the serverless concept seriously. Work with asynchronous queues, write front-ends that are fault tolerant (i.e. never assume the connection is there, or that things are consistent), architect with state machines, queues and notifications in place, and no longer rely on SQL Server (or any RDBMS for that matter) as your single source of truth. You may need to look at graph databases, NoSQL databases and other tooling for storage. If you need to Glue that all together, you can.

But it takes time and a lot of patience, and I dare say you need a legacy team to keep the lights on while a second group works through a genuine paradigm shift. You'll need the right skills and mindset to architect for failure scenarios - that isn't a skill everyone has, and it's been particularly notable among legacy-style developers.

I realise this last bit is a tangent, but it really sums up where cloud is advantageous, and where it's not. I struggle to see how the benefits of Amazon RDS for SQL Server pay off in an environment where you mix Self-Managed with Managed Services and I struggle to see why you'd only ever have Tier 2 and Tier 3 style applications in your environment. But alas - it's possible.

 

