Forem: Conner Phillis

No Indexes, No Parameters, No Problem

Conner Phillis — Tue, 07 Apr 2026 20:56:16 +0000

This story is from an organization I worked at a while back. A team had recently lost all of its developers after they voluntarily left the company. Big red flag - but I was young and eager to prove myself, so I volunteered to take it on. The app had a multitude of issues: poor separation of concerns, copy-pasted code everywhere, awful performance, and outdated tech.

I could probably write a whole series of posts on that application - it ended up costing me quite a bit of hair - but I wanted to talk about one of the craziest things I found.

The Bug

The first order of business was to fix the bugs that were preventing us from moving forward with customers that were using the software. There was a pretty standard create request that would fail somewhere inside of a 9,000 line function (yes... really...) for an unknown reason and customers were losing data. It had to be fixed.

The debugging tooling was terrible. The server was at the company's data center with no remote debugger access, so I had to build locally, drop DLLs loaded with logger statements on the machine, and read the output to figure out what went wrong. To make matters worse, the API swallowed the exception and always returned a 200, regardless of whether the request succeeded or failed. Unfortunately, the fastest way to verify my changes was to query the database directly and check whether the row had been inserted.

When the insert finally succeeded, the primary key for the newly inserted record was something like 205, while the previous one was something like 190. This was weird - the entire time that I was debugging the application we hadn't even gotten to the point where we were inserting rows, so there's no way we should have skipped those ids.

The issue was fixed - so I had time to figure out why it was doing that. I hadn't previously gone through the function line by line, focusing instead on the area of the code that was failing. At the start of the function was an ominous-looking function call:

var id = DbUtils.GetNextId("TableName")

dbo.Ids

Weird - I thought, why would we need a function to get the next id? SQL will create that for us, right?

The function definition contained a bit of SQL that looked something like this:

SELECT id FROM dbo.ids WHERE tableName = '<table-name>'

followed by

UPDATE dbo.ids SET id = id + 1 WHERE tableName = '<table-name>'

To date, I have yet to see anything in my career that made my jaw hit the floor that badly. I quickly selected against dbo.ids and found that every single table in the database had an entry in this singular table. Every single entity had to call DbUtils.GetNextId to get its next id.

Not only were we not using auto-incrementing primary keys, the operation to get the next id wasn't even atomic. If two calls came in at the same time and selected before one updated, two entities got the same id!!
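To make the race concrete, here is a minimal sketch in JavaScript rather than the application's actual C# and SQL. The in-memory idsTable object, the 'Orders' table name, and the selectId / incrementId helpers are all hypothetical stand-ins for the two statements above, run in the unlucky order where both callers select before either one updates.

```javascript
// Hypothetical in-memory stand-in for dbo.ids (one counter per table name).
const idsTable = { Orders: 190 };

// Step 1 of GetNextId: SELECT id FROM dbo.ids WHERE tableName = ...
function selectId(tableName) {
  return idsTable[tableName];
}

// Step 2 of GetNextId: UPDATE dbo.ids SET id = id + 1 WHERE tableName = ...
function incrementId(tableName) {
  idsTable[tableName] += 1;
}

// Two requests interleave: both read before either one writes.
const idForRequestA = selectId('Orders');
const idForRequestB = selectId('Orders');
incrementId('Orders');
incrementId('Orders');

console.log(idForRequestA, idForRequestB, idsTable.Orders); // 190 190 192
```

Both requests walk away with the same id, while the counter still moves forward by two. Since the counter advances whether or not the later insert succeeds, this would also be consistent with the gap in keys described earlier.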

Of course, I thought, at least the primary key constraint on the destination table should enforce uniqueness and keep us from getting into a bad state like that - that's just common sense, right?

Just in case, I opened SSMS & expanded the target table's indexes.

And it was empty.

I did a double-take, reloaded SSMS, and still empty. I checked my permissions to the database, nothing wrong - I was logged in as an SA. I checked another table - and empty there too.

Not one index in the entire database. Not a foreign key, a unique constraint, a primary key, nothing.

The dev prior to me hadn't known what an index was - and instead of doing research to figure out if there was a better way to do what he wanted, he just crapped out something that seemed like it worked.

Customers had always complained that bulk operations took forever - and now I knew why. Every single row insertion required at least three round trips to the database: two for the manual ID generation, and one for the actual insert. If only there was a better way...

The Aftermath

I ended up doing an audit of the entire codebase after that. It got worse. Not a single database parameter in the whole application. Every query was built with raw string concatenation. We were wide open to SQL injection.
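For anyone who hasn't seen why concatenation is dangerous, here is a small JavaScript sketch (not the application's code). Both builder functions are hypothetical, and the { text, values } shape is only an assumption about how a driver might accept parameters; real database libraries each have their own API.

```javascript
// Hypothetical helper in the style the audit found: the value is pasted into the SQL text.
function buildConcatenatedQuery(tableName) {
  return "SELECT id FROM dbo.ids WHERE tableName = '" + tableName + "'";
}

// Parameterized shape: the SQL text is fixed and the value travels separately.
// The { text, values } layout is an assumption made for this sketch.
function buildParameterizedQuery(tableName) {
  return {
    text: 'SELECT id FROM dbo.ids WHERE tableName = @tableName',
    values: { tableName },
  };
}

const hostileInput = "x'; DROP TABLE dbo.ids; --";

const concatenated = buildConcatenatedQuery(hostileInput);
const parameterized = buildParameterizedQuery(hostileInput);

console.log(concatenated);
// SELECT id FROM dbo.ids WHERE tableName = 'x'; DROP TABLE dbo.ids; --'
console.log(parameterized.text);
// SELECT id FROM dbo.ids WHERE tableName = @tableName
```

With concatenation the input becomes part of the statement itself. With parameters the statement never changes, and the hostile string is only ever treated as a value.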

There were enough bad findings in that application that I had to talk my boss into putting an immediate moratorium on further sales. We had to rebuild the entire thing from scratch.

There were further issues to fix at customers we'd already deployed to, and the rewrite took a ton of work - but at the end of the day, cleaning up that hot pile of garbage taught me more than any class in my university could have. As many sleepless nights as it took - I would have to say that without that experience I wouldn't have learned nearly as much as I know today. In the end, it led me to a career I enjoy that keeps me happy and fed.

Anyways - I've always enjoyed reading software horror stories. Figured it was about time that I wrote my own worst software story. Thanks for reading - feel free to tell me about your own in the comments.

no AI was used to write this post, this is just how I type lol

Optimizing CI/CD Pipelines: How Dynamic JavaScript Configuration Streamlined Our Deployments

Conner Phillis — Mon, 17 Feb 2025 14:00:12 +0000

I used to frequently run into problems building CI / CD pipelines at both my previous and my current organization.

Oftentimes, I would be building a website that needed to be deployed multiple times, either internally or externally.

Years ago, when I researched how to support rollouts like this, the general advice I came across was to use a .env file for each environment and have the build process substitute the appropriate environment variables. While this worked in some cases, I quickly realized that managing separate builds for each environment was inefficient, especially when deploying the same website to multiple locations. This approach was increasing the cost of our rented CI/CD agents and artifact storage, and adding unnecessary complexity to our workflows.

One possible solution that we piloted was to try to load the configuration into an API that would be fetched at runtime. While this seemed like a viable solution, it felt problematic. The primary issue was that we'd need to run an additional fetch call in our app, which slowed down our load times. Plus, we had to modify every API request or introduce guards to ensure that the configuration was loaded beforehand. This added unnecessary complexity and potential failure points.

What I eventually settled on was creating a JavaScript file to inject these settings into globalThis. By including this file at the start of the document, we ensure that the configuration is available as soon as the app code begins executing. This method provides a few key benefits:

Single Build, Multiple Deployments: Since you only need to run one build, you can generate your deployment artifacts once, and then customize the settings for each environment by using unique settings.js files for each deployment.
No Need for Extra Checks: With the configuration already loaded when the app starts, there's no need to worry about checking whether the configuration has been successfully loaded. The app can rely on it being available immediately, which simplifies the code and removes unnecessary validation steps.


<!-- index.html -->

<head>
  <meta charset="UTF-8" />
  <link rel="icon" href="./assets/favicon.svg" type="image/x-icon" />
  <script src="/settings.js"></script>
  <script src="/main.js"></script>
  <!-- remaining site assets ...  -->
</head>

Here, you can see that the settings.js file is loaded immediately before the main.js file. Inside of that settings.js file we have the following code:


// settings.js

const siteSettings = Object.freeze({
  // sso
  clientId: '<client-id>',
  tenantId: '<tenant-id>',
  audienceId: '<audience-id>',
  // api
  apiAddress: '<api-address>',
  // auditing
  metricsEndpoint: '<metrics-endpoint>'
});

globalThis.configuration = siteSettings;

When this script is loaded, it defines a new property on globalThis which holds all of our site-specific configuration.

A few notes that I should add:

NEVER EVER store sensitive configuration information in your settings.js files - If you do this, it's essentially broadcasting your secrets to the world. This probably goes without saying, but it's better that I spell this out before someone blames me for their secrets getting leaked.

Potential Namespace Pollution - I recommend that you choose a unique name (not configuration) when assigning to globalThis or window. There is no guarantee that one of your dependencies hasn't decided that they want to use globalThis.configuration to hold some global state that it needs. You should use a globalThis key that gives you reasonable assurance that no other code is writing to it. Think globalThis.<my-app-name>_settings.

Keep it lean - don't put anything in your settings.js file that you wouldn't put in a .env file. This file should be lightweight: no massive base64 strings for your favicon, or other weird stuff like that. Keep it to keys and values so your users have no idea that there were a couple of lines of extra JS loaded.
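As a rough sketch of the namespacing note, this is what a settings file and its consumer could look like with an app-specific key. The myApp_settings name is hypothetical, so substitute something unique to your own application.

```javascript
// settings.js (sketch). 'myApp_settings' is a hypothetical key; pick one unique to your app.
globalThis.myApp_settings = Object.freeze({
  apiAddress: '<api-address>',
  metricsEndpoint: '<metrics-endpoint>',
});

// main.js (sketch). Application code reads from the namespaced key only.
const { apiAddress, metricsEndpoint } = globalThis.myApp_settings;

console.log(apiAddress, metricsEndpoint); // <api-address> <metrics-endpoint>
console.log(Object.isFrozen(globalThis.myApp_settings)); // true
```

Object.freeze is shallow, so this only protects the top-level keys. That is fine for a flat list of keys and values, which is all this file should hold anyway.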

Conclusion

I've found this approach great for simplifying CI/CD pipelines. It has reduced the total executions of our build pipeline and made it much easier for us to manage our release builds.

It does come with trade-offs; there are some security risks (particularly with exposing sensitive configuration values) and potential conflicts in the global namespace. However, it is important to note that these are not risks introduced by the method, rather they are risks inherent to developing public applications.

By ensuring sensitive data is handled separately and by implementing best practices around naming and performance, these drawbacks can be mitigated.

As always, feel free to critique this in the comments. I am always happy to create revisions and offer corrections. I recognize that this is a rather niche option that most people wouldn't be able to take advantage of, but I still think it should be published for people with similar issues.

note - My organization uses a template settings.js file and replaces the individual keys of the settings using environment variables. If there is interest in this approach I can post and link that approach as well.
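Until then, here is one way such a template step might look. This is an assumption-heavy sketch and not the script my organization actually uses: the renderSettings function, the rule that '<api-address>' maps to the API_ADDRESS variable, and the example URLs are all made up for illustration.

```javascript
// Replace every quoted '<some-key>' placeholder with the value of the SOME_KEY variable.
// Both the placeholder syntax and the naming rule are assumptions for this sketch.
function renderSettings(template, env) {
  return template.replace(/'<([a-z][a-z0-9-]*)>'/g, (placeholder, key) => {
    const variableName = key.toUpperCase().replace(/-/g, '_');
    const value = env[variableName];
    if (value === undefined) {
      throw new Error(`Missing environment variable ${variableName} for ${placeholder}`);
    }
    // JSON.stringify emits a quoted, escaped string literal that is valid JavaScript.
    return JSON.stringify(value);
  });
}

const template = `globalThis.myApp_settings = Object.freeze({
  apiAddress: '<api-address>',
  metricsEndpoint: '<metrics-endpoint>'
});`;

// In a pipeline step you would likely pass process.env instead of a literal object.
const rendered = renderSettings(template, {
  API_ADDRESS: 'https://api.example.test',
  METRICS_ENDPOINT: 'https://metrics.example.test',
});

console.log(rendered);
```

Failing loudly on a missing variable keeps a half-filled settings.js from ever reaching a deployment.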

Sequential GUIDs in Entity Framework Core Might Not Be Sequential

Conner Phillis — Fri, 24 Mar 2023 16:15:57 +0000

Edit April 27, 2023 - The Entity Framework Core team has since opened this issue in response to this article.

The Background

Our customers more often than not choose to host our application on their own machines, so we frequently get asked what the minimum hardware requirements are. Until now, we based the estimates we provided on the requirements of similar applications.

With our latest release we decided we'd get a conclusive answer for ourselves, so we put some resources into running benchmarks. We settled on a simple setup: write a script that would simulate a series of requests running through hot paths, and see how many operations we could complete in a fixed time frame. The script would run X concurrent requests for N minutes, log the statistics, and export our results to a CSV file for analysis.

We architected our test server to simulate an organization stressed for resources. On a single virtual machine we installed SQL Server, IIS, and our application. For the hardware behind the virtual machine we used an Azure F4s v2 (4 vCPU, 8GB).

For our warm-up, we ran the script with 20 concurrent tasks for 10 minutes. The results that we got?

263724 total requests completed
67431 form submissions in 10 minutes across 20 tasks

While this may not seem like a lot for some, this was great for us. We consider our workloads somewhat computationally expensive, and didn't imagine we would get these sort of numbers out of our code. Especially when hosting the server and database on the same machine.

Our logs indicated that we were on average consuming about 70% of the CPU. The data that we got was plenty for us to determine our hardware requirements, but just for fun we decided to see how far we could push it. We resized the VM to an F8s V2 (8 vCPU 16GB) expecting linear results.

The script was set, 50 concurrent tasks instead of 20 to account for the increase in core count, running for ten minutes. The results?

275532 total requests completed
68883 form submissions in 10 minutes across 50 tasks

What!?!? We doubled the hardware, 2.5x'd the number of concurrent runs, and ended up with only ~4.5% more completed requests and ~2% more form submissions. This set off an alarm for us: we obviously had a large issue with the scalability of our application.

Investigating the Issue

The first thing that we theorized was that the increased number of tasks was causing problems with IIS, causing connections to stay open for longer than they should. We altered the parameters of our test script to use 20 tasks over 10 minutes, mirroring the test against the F4s machine. After 10 minutes, the results were...

275916 total requests completed
68979 form submissions in 10 minutes across 20 tasks

The same?? There was only a marginal difference in the results: less than 1% from the 50-task run on the same machine. The test machine was hardly using a fraction of the processing power and network it could utilize. Something bigger was afoot.

We started a Remote Desktop session with the server and ran another test, 10 minutes, 20 tasks. We observed SQL Server start by consuming ~30% of our CPU time, and watched it move up to as much as 60% of the CPU by the end of the run. Over time, our performance was getting worse.

On a whim, we ran a query to check for index fragmentation of the database.

The index fragmentation was far above what could be expected out of a healthy database: north of 50% for some indexes. While we can't prove right now that this is what is causing our scaling issue¹, it does explain how SQL Server can continuously need more resources. As the size of the data grows, SQL Server has to spend more time doing table scans and expend more resources on IO.

We found this puzzling, since we were using Entity Framework Core's Sequential Guid Value Generator with the DatabaseGeneratedOption.Identity option. The documentation states:

Generates sequential Guid values optimized for use in Microsoft SQL server clustered keys or indexes, yielding better performance than random values. This is the default generator for SQL Server Guid columns which are set to be generated on add.

It's important to note in addition to this documentation for those that aren't aware, setting a column to use a GUID as a key with DatabaseGeneratedOption.Identity does not mean that it will be generated by the database. Instead, EF Core generates the sequential GUID itself, and then inserts it into the database (read here). This can be observed when comparing GUIDs generated normally to those generated by NEWSEQUENTIALID later in this post.

Additionally, this issue in the EF Core repository shows that EF Core generates GUIDs better than SQL Server does. The documentation wasn't lining up with what we were seeing, so it was time to recreate the EF tests and see if we could simulate the behavior we were getting from our server.

Running our Own Benchmarks

The first thing we did was see if we could reproduce the test done by roji on the EF Core team with 100,000 rows. And...

Method	Average page space used in %	Average fragmentation in percent	Record Count
NEWSEQUENTIALID	99.91 %	1.04 %	100000
EF Core Sequential Guid Value Generator	99.86 %	0.56 %	100000

Same results as the team found. The EF Core value generator is still generating GUIDs optimally as of SQL Server 2022.

But wait... this isn't really how a web server works. Entities aren't just inserted one after another. They are created in response to user activity, which can happen at any time, and different user hardware can mean these operations take different amounts of time. What if we modify the test to simulate a large degree of parallel actions rather than purely sequential inserts?

We altered our script: instead of inserting 100,000 rows sequentially, we created 20 tasks and told each of those tasks to insert 5,000 rows into the database. Once this was done, we looked at index fragmentation again.

Parallel Entity Framework Sequential Guid Generation

average page space used in %	average fragmentation in percent	Record Count
57.93 %	44.53 %	100000

Multithreaded Simulation Code

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

class Program
{
  static async Task Main(string[] args)
  {
    // Start every run from an empty database.
    await using (var globalCtx = new BlogContext())
    {
      await globalCtx.Database.EnsureDeletedAsync();
      await globalCtx.Database.EnsureCreatedAsync();
    }

    var counter = 0;

    // 20 parallel tasks, each saving 5000 rows one at a time through its own context.
    var tasks = new List<Task>();
    for (int i = 0; i < 20; i++)
    {
      var t = Task.Run(async () =>
      {
        await using var ctx = new BlogContext();

        for (var j = 0; j < 5000; j++)
        {
          // The shared counter only makes each blog name unique across tasks.
          var value = Interlocked.Increment(ref counter);
          ctx.Blogs.Add(new Blog { Name = "Foo" + value });
          await ctx.SaveChangesAsync();
        }
      });

      tasks.Add(t);
    }

    await Task.WhenAll(tasks);
  }
}

public class BlogContext : DbContext
{
  public DbSet<Blog> Blogs { get; set; }

  protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    => optionsBuilder.UseSqlServer("Server=.;Database=Testing;Trusted_Connection=true;Encrypt=false;");
}

public class Blog
{
  // The Guid key is generated client-side by EF Core before the insert is sent.
  public Guid Id { get; set; }
  public string Name { get; set; }
}

The top 10 results returned when querying the database illuminate the issue:

Our conclusion? Entity Framework seeks to create an efficient value generation strategy optimized for SQL Server, but after the network stack has its say, it's likely that some rows will be inserted out of their original order.

Compare that to the results that you get when running the same code, but setting HasDefaultValueSql("NEWSEQUENTIALID()") in the OnModelCreating method in the database context:

Parallel Guid Generation with NEWSEQUENTIALID()

average page space used in %	average fragmentation in percent	Record Count
96.03 %	7.67 %	100000

The fragmentation percentage is still not as good as inserting the rows one after the other, and the average page space used is a bit lower, but I think we can all agree that it's better than generating the IDs in memory with Entity Framework Core.

This method has drawbacks too, however. Looking at the GUIDs that SQL generates it's hard to say that they have the same uniqueness guarantee that standard GUIDs have. It appears that the leading bits of the GUIDs are all that change when taking a sample of the first 10 inserted in the database after our concurrent test:

(in case anyone is curious, generating the GUIDs randomly led to a fragmentation percentage of almost 99%)
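As a rough illustration of why keys that arrive out of order hurt page density, here is a toy model in JavaScript. It is not how SQL Server stores data and it was not part of our benchmarks: the page capacity, the split rule, and the jitter used to fake parallel arrivals are all assumptions chosen to keep the sketch short.

```javascript
const PAGE_CAPACITY = 100; // keys per leaf page (arbitrary for this sketch)

// Insert a key into an already sorted page.
function insertSorted(page, key) {
  let i = page.length;
  while (i > 0 && page[i - 1] > key) i--;
  page.splice(i, 0, key);
}

// Toy index: an ordered list of pages, each holding up to PAGE_CAPACITY sorted keys.
function insertKey(pages, key) {
  // First page whose largest key is >= key; -1 means the key is a new maximum.
  let p = pages.findIndex(page => page.length > 0 && page[page.length - 1] >= key);
  const appending = p === -1;
  if (appending) p = pages.length - 1;
  const page = pages[p];

  if (page.length < PAGE_CAPACITY) {
    insertSorted(page, key);
  } else if (appending) {
    // New maximum and the last page is full: start a fresh page, keep the full one intact.
    pages.push([key]);
  } else {
    // Key lands inside a full page: split it in half, then insert into the correct half.
    const right = page.splice(PAGE_CAPACITY / 2);
    pages.splice(p + 1, 0, right);
    insertSorted(key <= page[page.length - 1] ? page : right, key);
  }
}

// Build the toy index in the given arrival order and report how full its pages are (0..1).
function averageFill(arrivalOrder) {
  const pages = [[]];
  for (const key of arrivalOrder) insertKey(pages, key);
  return arrivalOrder.length / (pages.length * PAGE_CAPACITY);
}

// Keys are generated in order 0..n-1, but each one arrives after a pseudo-random delay,
// standing in for parallel requests that reach the database at slightly different times.
function jitteredArrivalOrder(n, maxDelay) {
  let seed = 42;
  const arrivals = [];
  for (let key = 0; key < n; key++) {
    seed = (seed * 1664525 + 1013904223) % 4294967296; // small deterministic LCG
    arrivals.push({ key, time: key + (Math.floor(seed / 65536) % maxDelay) });
  }
  arrivals.sort((a, b) => a.time - b.time);
  return arrivals.map(a => a.key);
}

const inOrder = Array.from({ length: 10000 }, (_, i) => i);
const sequentialFill = averageFill(inOrder);
const jitteredFill = averageFill(jitteredArrivalOrder(10000, 40));

console.log('in-order fill:', sequentialFill);
console.log('jittered fill:', jitteredFill);
```

In this toy model the in-order run packs every page completely, while the jittered run keeps splitting full pages and leaves many of them roughly half full. It says nothing about the exact percentages SQL Server would report, only about the mechanism.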

Studying the Issue

There were two main benefits that initially brought us to use GUIDs as primary keys in our database.

We sometimes have to export data across servers, so the (near) uniqueness guarantee meant that it should be trivial to merge the data
Certain actions don't require our users to be connected to our server all the time as long as they do a periodic sync. In this case we could let the client generate IDs and after the sync turn the IDs into sequential ones. Once we were done with the transformation we just had to inform the client of the new IDs.

Unfortunately, the SQL Server GUIDs don't seem like they would be able to cut it for us, as it seems likely that a collision could occur when exporting from one server to another.

This led us to a tough crossroads. Do we

Keep going, knowing that scaling up our application leads to highly diminishing returns necessitating expensive hardware OR
Lose the benefits GUIDs give us in favor of another primary key format that would be better suited for parallel inserts.

Ultimately, we decided that our best path forward was to go with a hybrid approach. We would alter our tables to have two IDs where GUIDs are required. This involved using an integer primary key generated by the database, and GUID value as a non-clustered index with a unique constraint. These GUIDs would use the SequentialGuidValueGenerator to try to "presort" some of the items in the non-clustered index, but we wouldn't enforce that it had to be a sequential GUID.

After performing our parallel benchmark, we ended up with the following results:

Hybrid Key Generation Approach

average page space used in %	average fragmentation in percent	Record Count
94.15 %	10.38 %	100000

Just in case, we ran the benchmark again with only an integer primary key, and that yielded a fragmentation percentage of almost exactly 12%. It really just seems that some fragmentation is unavoidable in a parallel context.

The Great Key Migration

Armed with the results of the benchmarks we had run, we decided that we would make a gamble. Every table that used a GUID primary key would be altered to contain an auto-incrementing integer primary key and a GUID UniqueId column with a unique constraint enforced. We would still use the Entity Framework Core GUID value generator to create these unique IDs, so as to reduce the amount of work SQL Server would have to do maintaining the unique constraint.

In the end, it took roughly two weeks of work, and by the end we had modified 600 files according to Git. We ran the benchmark again with the new keys and our test script output the result:

334192 total requests completed
83548 form submissions in 10 minutes across 20 tasks

This absolutely shocked us. We had obtained a total boost of ~24% in form submissions over our first run by changing our code to use integer primary keys instead of GUIDs.

Furthermore, our 8 core results showed a near-linear increase, to 153,076 submissions, which is more than double what the same machine managed with GUID keys. Further analysis showed that the processor wasn't being 100% utilized in this benchmark. Some may say the time investment or the risk involved isn't worth it, but in our minds, the tradeoff we got was more than worth it.
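For anyone who wants to check the arithmetic, the relative changes can be worked out from the form submission counts reported in this post. The sketch below assumes that the 83,548 figure comes from the 4 vCPU machine, which the numbers imply but which isn't stated outright; the percentChange helper and the key names are just for illustration.

```javascript
// Form submissions per 10-minute run, copied from the benchmark results in this post.
const submissions = {
  guid4Core: 67431,
  guid8Core: 68979, // 20 tasks on the 8 vCPU machine
  int4Core: 83548, // assumed to be the 4 vCPU machine
  int8Core: 153076,
};

// Relative change from `before` to `after`, as a percentage rounded to one decimal place.
function percentChange(before, after) {
  return Math.round(((after - before) / before) * 1000) / 10;
}

console.log(percentChange(submissions.guid4Core, submissions.guid8Core)); // 2.3
console.log(percentChange(submissions.guid4Core, submissions.int4Core)); // 23.9
console.log(percentChange(submissions.int4Core, submissions.int8Core)); // 83.2
```

With GUID keys, doubling the cores bought about 2% more submissions. With integer keys, the same doubling bought about 83% more.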

Closing

I'd like to end this post with a couple of acknowledgements.

First, I don't believe that using the sequential id generator strategy is bad. The Entity Framework Core team's benchmarks show that it does great work in a purely sequential workload. As long as you aren't expecting a high degree of parallelism, it seems that these GUIDs are perfectly fine as a primary key. Even if you do have a parallel workload, it's still possible to reorganize your clustered indexes.

Second, I want to acknowledge that it's totally possible that this is all a coincidence, and that the GUIDs weren't the cause of the performance issues that we were seeing in SQL Server. It's our belief that they are the culprit. We also want to raise awareness that the assumption we made was wrong: just because SequentialGuidValueGenerator uses a strategy optimized for sequential access in SQL Server doesn't mean the GUIDs are always going to be inserted sequentially.

Lastly, I encourage anyone who reads this to look into the methods enclosed and run their own benchmarks to draw their own conclusions. If there is a flaw in my methods I'm happy to make an edit or publish a correction.

Thank You!

Thank you for reading my first blog post. Please let me know what worked, and what didn't.

-- Conner

¹ It still perplexes us as to how it didn't show up on the smaller machine. It's possible (spoiler) that since we had fewer cores we had a lesser degree of parallelism, so rows were not being inserted out of order as badly.