Speeding up Entity Framework Core

Do not be greedy!


When selecting data, you need to choose exactly as many as you need at a time. Never retrieve all data from a table!

Wrong:

using var ctx = new EFCoreTestContext(optionsBuilder.Options);                
//    ID  ,       !
ctx.FederalDistricts.Select(x=> new { x.ID, x.Name, x.ShortName }).ToList();

Correctly:

using var ctx = new EFCoreTestContext(optionsBuilder.Options);  
//     ID     !
ctx.FederalDistricts.Select(x=> new { x.Name, x.ShortName }).ToList();
ctx.FederalDistricts.Select(x => new MyClass { Name = x.Name, ShortName = x.ShortName }).ToList();


Wrong:

var blogs = context.Blog.ToList(); //       . ?
//     ?
var somePost = blogs.FirstOrDefault(x=>x.Title.StartWidth(“Hello world!”));

Correctly:

var somePost = context.Blog.FirstOrDefault(x=>x.Title.StartWidth(“Hello world!”));

Integrated data validation can be performed when a query returns some records.

Wrong:


var blogs = context.Blogs.Where(blog => StandardizeUrl(blog.Url).Contains("dotnet")).ToList();

public static string StandardizeUrl(string url)
{
    url = url.ToLower();
    if (!url.StartsWith("http://"))
    {
        url = string.Concat("http://", url);
    }
    return url;
}

Correctly:

var blogs = context.Blogs.AsEnumerable().Where(blog => StandardizeUrl(blog.Url).Contains("dotnet")).ToList();
 
//  
var blogs = context.Blogs.Where(blog => blog.Contains("dotnet"))
    .OrderByDescending(blog => blog.Rating)
    .Select(blog => new
    {
        Id = blog.BlogId,
        Url = StandardizeUrl(blog.Url)
    })
    .ToList();

Wow, wow, wow, clocked up.

It's time to refresh your knowledge of LINQ techniques a bit.

Let's look at the differences between ToList AsEnumerable AsQueryable


So ToList

  • .
  • .ToList() (lazy loading), .

AsEnumerable

  • (lazy loading)
  • : Func <TSource, bool>
  • ( Where/Take/Skip , , select * from Table1,
  • , N )
  • : Linq-to-SQL + Linq-to-Object.
  • IEnumerable (lazy loading).

AsQueryable

  • (lazy loading)
  • :
    AsQueryable(IEnumerable)  AsQueryable<TElement>(IEnumerable<TElement>) 

  • Expression T-SQL ( ), .
  • DbSet ( Entity Framework) AsQueryable .
  • , Take(5) «select top 5 * SQL» . , SQL , . AsQueryable() , AsEnumerable() T-SQL Linq .
  • Use AsQueryable if you want a database query that can be improved before running on the server side.


An example of using AsQueryable in the simplest case:

public IEnumerable<EmailView> GetEmails(out int totalRecords, Guid? deviceWorkGroupID,
                DateTime? timeStart, DateTime? timeEnd, string search, int? confirmStateID, int? stateTypeID, int? limitOffset, int? limitRowCount, string orderBy, bool desc)
        {
            var r = new List<EmailView>();

            using (var db = new GJobEntities())
            {
                var query = db.Emails.AsQueryable();

                if (timeStart != null && timeEnd != null)
                {
                    query = query.Where(p => p.Created >= timeStart && p.Created <= timeEnd);
                }

                if (stateTypeID != null && stateTypeID > -1)
                {
                    query = query.Where(p => p.EmailStates.OrderByDescending(x => x.AtTime).FirstOrDefault().EmailStateTypeID == stateTypeID);
                }


                if (confirmStateID != null && confirmStateID > -1)
                {
                    var boolValue = confirmStateID == 1 ? true : false;
                    query = query.Where(p => p.IsConfirmed == boolValue);
                }

                if (!string.IsNullOrEmpty(search))
                {
                    search = search.ToLower();
                    query = query.Where(p => (p.Subject + " " + p.CopiesEmails + " " + p.ToEmails + " " + p.FromEmail + " " + p.Body)
                                        .ToLower().Contains(search));
                }

                if (deviceWorkGroupID != Guid.Empty)
                {
                    query = query.Where(x => x.SCEmails.FirstOrDefault().SupportCall.Device.DeviceWorkGroupDevices.FirstOrDefault(p => p.DeviceWorkGroupID == deviceWorkGroupID) != null);
                }

                totalRecords = query.Count();
                query = query.OrderByDescending(p => p.Created);
                if (limitOffset.HasValue)
                {
                    query = query.Skip(limitOffset.Value).Take(limitRowCount.Value);
                }
                var items = query.ToList(); //    

                foreach (var item in items)
                {
                    var n = new EmailView
                    {
                        ID = item.ID,
                        SentTime = item.SentTime,
                        IsConfirmed = item.IsConfirmed,
                        Number = item.Number,
                        Subject = item.Subject,
                        IsDeleted = item.IsDeleted,
                        ToEmails = item.ToEmails,
                        Created = item.Created,
                        CopiesEmails = item.CopiesEmails,
                        FromEmail = item.FromEmail,
                    };

                    //     - 

                    r.Add(n);
                }
            }

            return r;
        }


The magic of simple reading


If you do not need to change data, just display use the .AsNoTracking () method.

Slow sampling

var blogs = context.Blogs.ToList();

Quick fetch (read only)

var blogs = context.Blogs.AsNoTracking().ToList();

Feel you have already warmed up a bit?

Types of loading related data


For those who have forgotten what lazy loading is .

Lazy loading (Lazy loading) means that the associated data transparently loaded from the database when accessing the property of navigation. Read more here .

And at the same time, let me remind you of other types of loading related data.

Active load (Eager loading) means that the associated data is loaded from the database as part of the initial request.

using (var context = new BloggingContext())
{
    var blogs = context.Blogs
        .Include(blog => blog.Posts)
            .ThenInclude(post => post.Author)
                .ThenInclude(author => author.Photo)
        .Include(blog => blog.Owner)
            .ThenInclude(owner => owner.Photo)
        .ToList();
}

Attention! Starting with version EF Core 3.0.0, each Include will cause an additional JOIN to be added to the SQL queries created by relational providers, while previous versions generated additional SQL queries. This can significantly change the performance of your queries, for better or for worse. In particular, LINQ queries with an extremely large number of inclusion statements can be split into several separate LINQ queries.

Explicit loading (Explicit loading) means that the associated data explicitly loaded from the database later.

using (var context = new BloggingContext())
{
    var blog = context.Blogs
        .Single(b => b.BlogId == 1);

    var goodPosts = context.Entry(blog)
        .Collection(b => b.Posts)
        .Query()
        .Where(p => p.Rating > 3)
        .ToList();
}

Jerk and breakthrough! Moving on?

Ready to speed up even more?


To speed up dramatically when fetching complexly structured and even abnormalized data from a relational database, there are two ways to do this: use indexed views (1) or, even better, pre-prepared (calculated) data in a simple flat form for display (2).

(1) Indexed view in the context of MS SQL Server

The indexed view has a unique clustered index. A unique clustered index is stored in SQL Server and is updated like any other clustered index. An indexed view is more significant than standard views, which include complex processing of a large number of rows, for example, aggregating a large amount of data or combining multiple rows.

If such views are often referenced in queries, we can improve performance by creating a unique clustered index for the view. For a standard view, the result set is not stored in the database; instead, the result set is calculated for each query, but in the case of a clustered index, the result set is stored in the database in the same way as a table with a clustered index. Queries that do not specifically use the indexed view may even benefit from the existence of a clustered index from the view.

Representation of an index has a certain cost in the form of productivity. If we create an indexed view, every time we change the data in the base tables, SQL Server must support not only the index records in these tables, but also the index records in the view. In SQL Server editions for developers and enterprises, the optimizer can use view indexes to optimize queries that do not specify an indexed view. However, in other editions of SQL Server, the query must include an indexed view and provide a NOEXPAND hint to take advantage of the index in the view.

(2) If you need to make a request that requires the display of more than three levels of related tables in the amount of three or more with increased CRUDload, the best way would be to periodically calculate the result set, save it in a table and use for display. The resulting table in which the data will be stored should have a Primary Key and indexes on the search fields in LINQ .

What about asynchrony?


Yes! We use it wherever possible! Here is an example:

public void Do()
{
    var myTask = GetFederalDistrictsAsync ();
    foreach (var item in myTask.Result)
    {
         // 
    }
}

public async Task<List<FederalDistrict>> GetFederalDistrictsAsync()
{
    var conn = configurationRoot.GetConnectionString("EFCoreTestContext");
    optionsBuilder.UseSqlServer(conn);
    using var context = new EFCoreTestContext(optionsBuilder.Options);
    return await context.FederalDistricts.ToListAsync();
}

And yes, have you forgotten anything to increase productivity? Buum!

return await context.FederalDistricts.<b>AsNoTracking()</b>.ToListAsync();


Note: the Do () method was added for demonstration purposes only, in order to indicate the operability of the GetFederalDistrictsAsync () method . As my colleagues correctly noted, another example of pure asynchrony is needed.

And let me give it based on the concept of a view component in ASP .NET Core :

//  
public class PopularPosts : ViewComponent
    {
        private readonly IStatsRepository _statsRepository;

        public PopularPosts(IStatsRepository statsRepository)
        {
            _statsRepository = statsRepository;
        }

        public async Task<IViewComponentResult> InvokeAsync()
        {
           //         -
            var federalDistricts = await _statsRepository.GetFederalDistrictsAsync(); 
            var model = new TablePageModel()
            {
                FederalDistricts = federalDistricts,
            };

            return View(model);
        }
    }
    // 
    
    /// <summary>
    ///  -   .... -
    /// </summary>
    public interface IStatsRepository
    {
        /// <summary>
        ///        
        /// </summary>
        /// <returns></returns>
        IEnumerable<FederalDistrict> FederalDistricts();

        /// <summary>
        ///        
	/// !!!
        /// </summary>
        /// <returns></returns>
        Task<List<FederalDistrict>> GetFederalDistrictsAsync();
    }	
	
    /// <summary>
    /// -   .... -
    /// </summary>
    public class StatsRepository : IStatsRepository
    {
        private readonly DbContextOptionsBuilder<EFCoreTestContext>
            optionsBuilder = new DbContextOptionsBuilder<EFCoreTestContext>();
        private readonly IConfigurationRoot configurationRoot;

        public StatsRepository()
        {
            IConfigurationBuilder configurationBuilder = new ConfigurationBuilder()
                    .SetBasePath(Environment.CurrentDirectory)
                    .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true);
            configurationRoot = configurationBuilder.Build();
        }

        public async Task<List<FederalDistrict>> GetFederalDistrictsAsync()
        {
            var conn = configurationRoot.GetConnectionString("EFCoreTestContext");
            optionsBuilder.UseSqlServer(conn);
            using var context = new EFCoreTestContext(optionsBuilder.Options);
            return await context.FederalDistricts.Include(x => x.FederalSubjects).ToListAsync();
        }

        public IEnumerable<FederalDistrict> FederalDistricts()
        {
            var conn = configurationRoot.GetConnectionString("EFCoreTestContext");
            optionsBuilder.UseSqlServer(conn);

            using var ctx = new EFCoreTestContext(optionsBuilder.Options);
            return ctx.FederalDistricts.Include(x => x.FederalSubjects).ToList();
        }
    }

   //         Home\Index 
    <div id="tableContainer">
            @await Component.InvokeAsync("PopularPosts")
     </div>
  //   HTML      Shared\Components\PopularPosts\Default.cshtml



Let me remind you when queries are executed in the Entity Framework Core.

When calling LINQ statements, you simply create a query view in memory. The request is sent to the database only after processing the results.

The following are the most common operations that result in a request being sent to the database.

  • Iterate over the results in a for loop.
  • Using an operator, such as ToList, ToArray, Single, Count.
  • Binding data from query results to the user interface.

How to organize the EF Core code in terms of application architecture?


(1) From the point of view of the architecture of the application, it is necessary to ensure that the access code to your database is isolated / separated in a clearly defined place (in isolation). This allows you to find database code that affects performance.

(2) Do not mix the access code for your database with other parts of the application, such as the user interface or API. Thus, the database access code can be changed without worrying about other issues not related to the database.

How to save data correctly and quickly using SaveChanges ?


If the inserted records are the same, it makes sense to use one save operation for all records.

Wrong

using(var db = new NorthwindEntities())
{
var transaction = db.Database.BeginTransaction();

try
{
    //   1
    var  obj1 = new Customer();
    obj1.CustomerID = "ABCDE";
    obj1.CompanyName = "Company 1";
    obj1.Country = "USA";
    db.Customers.Add(obj1);

  //          db.SaveChanges();

    //   2
    var  obj2 = new Customer();
    obj2.CustomerID = "PQRST";
    obj2.CompanyName = "Company 2";    
    obj2.Country = "USA";
    db.Customers.Add(obj2);

    //   
    db.SaveChanges();

    transaction.Commit();
}
catch
{
    transaction.Rollback();
}
}

Correctly

using(var db = new NorthwindEntities())
{
var transaction = db.Database.BeginTransaction();

try
{
   //  1
    var  obj1 = new Customer();
    obj1.CustomerID = "ABCDE";
    obj1.CompanyName = "Company 1";
    obj1.Country = "USA";
    db.Customers.Add(obj1); 

    //   2
    var  obj2 = new Customer();
    obj2.CustomerID = "PQRST";
    obj2.CompanyName = "Company 2";    
    obj2.Country = "USA";
    db.Customers.Add(obj2);

   //    N 
    db.SaveChanges();

    transaction.Commit();
}
catch
{
    transaction.Rollback();
}
}

There are always exceptions to the rule. If the transaction context is complex, that is, consists of several independent operations, then you can save after each operation. And it’s even more correct to use asynchronous storage in a transaction.

//    
public async Task<IActionResult> AddDepositToHousehold(int householdId, DepositRequestModel model)
{
    using (var transaction = await Context.Database.BeginTransactionAsync(IsolationLevel.Snapshot))
    {
        try
        {
            //     
            var deposit = this.Mapper.Map<Deposit>(model);
            await this.Context.Deposits.AddAsync(deposit);

            await this.Context.SaveChangesAsync();

            //    
               var debtsToPay = await this.Context.Debts.Where(d => d.HouseholdId == householdId && !d.IsPaid).OrderBy(d => d.DateMade).ToListAsync();

            debtsToPay.ForEach(d => d.IsPaid = true);

            await this.Context.SaveChangesAsync();

            //   
            var household = this.Context.Households.FirstOrDefaultAsync(h => h.Id == householdId);

            household.Balance += model.DepositAmount;

            await this.Context.SaveChangesAsync();

            transaction.Commit();
            return this.Ok();
        }
        catch
        {
            transaction.Rollback();
            return this.BadRequest();
        }
    }
}

Triggers, Computed Fields, Custom Functions, and EF Core


To reduce the load on applications containing EF Core, it makes sense to use simple calculated fields and database triggers, but it’s better not to get involved, as the application can be very confusing. But user-defined functions can be very useful especially during fetch operations!

Concurrency in EF Core


If you want to parallelize everything to speed up, then break off: EF Core does not support the execution of several parallel operations in one instance of the context. Wait for one operation to complete before starting the next. To do this, you usually need to specify the await keyword in each asynchronous operation.

EF Core uses asynchronous queries to avoid blocking the flow when executing a query in the database. Asynchronous requests are important to ensure quick user interface response in thick clients. They can also increase throughput in a web application, where you can free up the thread to handle other requests. Here is an example:

public async Task<List<Blog>> GetBlogsAsync()
{
    using (var context = new BloggingContext())
    {
        return await context.Blogs.ToListAsync();
    }
}

What do you know about LINQ compiled queries?


If you have an application that repeatedly performs structurally similar queries in the Entity Framework, you can often improve performance by compiling the query once and executing it several times with different parameters. For example, an application might need to get all customers in a specific city; the city is indicated at runtime by the user in the form. LINQ to Entities supports the use of compiled queries for this purpose.

Starting with the .NET Framework 4.5, LINQ queries are cached automatically. However, you can still use compiled LINQ queries to reduce this cost in subsequent runs, and compiled queries can be more efficient than LINQ queries that are automatically cached. Note that LINQ to Entities queries that apply the Enumerable.Contains operator to in-memory collections are not cached automatically. Also, parameterization of in-memory collections in compiled LINQ queries is not allowed.

Many examples can be found here .

Do not make large DbContext contexts!


In general, I know many of you, if not almost all, of lazy f_u__c_k__e_r__s and the whole database you place in one context, especially this is typical for the Database-First approach. And in vain you do it! The following is an example of how the context can be divided. Of course, connection tables between contexts will have to be duplicated, this is a minus. One way or another, if you have more than 50 tables in the context, it is better to think about its separation.

Using context grouping (pooling DdContext)


The meaning of the DbContext pool is to allow reuse of DbContext instances from the pool, which in some cases can lead to better performance than creating a new instance each time. This is also the main reason for creating a connection pool in ADO.NET, although performance gains for connections will be more significant since connections are generally a harder resource.

using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

namespace Demos
{
    public class Blog
    {
        public int BlogId { get; set; }
        public string Name { get; set; }
        public string Url { get; set; }
    }

    public class BloggingContext : DbContext
    {
        public static long InstanceCount;

        public BloggingContext(DbContextOptions options)
            : base(options)
            => Interlocked.Increment(ref InstanceCount);

        public DbSet<Blog> Blogs { get; set; }
    }

    public class BlogController
    {
        private readonly BloggingContext _context;

        public BlogController(BloggingContext context) => _context = context;

        public async Task ActionAsync() => await _context.Blogs.FirstAsync();
    }

    public class Startup
    {
        private const string ConnectionString
            = @"Server=(localdb)\mssqllocaldb;Database=Demo.ContextPooling;Integrated Security=True;ConnectRetryCount=0";

        public void ConfigureServices(IServiceCollection services)
        {
            services.AddDbContext<BloggingContext>(c => c.UseSqlServer(ConnectionString));
        }
    }

    public class Program
    {
        private const int Threads = 32;
        private const int Seconds = 10;

        private static long _requestsProcessed;

        private static async Task Main()
        {
            var serviceCollection = new ServiceCollection();
            new Startup().ConfigureServices(serviceCollection);
            var serviceProvider = serviceCollection.BuildServiceProvider();

            SetupDatabase(serviceProvider);

            var stopwatch = new Stopwatch();

            MonitorResults(TimeSpan.FromSeconds(Seconds), stopwatch);

            await Task.WhenAll(
                Enumerable
                    .Range(0, Threads)
                    .Select(_ => SimulateRequestsAsync(serviceProvider, stopwatch)));
        }

        private static void SetupDatabase(IServiceProvider serviceProvider)
        {
            using (var serviceScope = serviceProvider.CreateScope())
            {
                var context = serviceScope.ServiceProvider.GetService<BloggingContext>();

                if (context.Database.EnsureCreated())
                {
                    context.Blogs.Add(new Blog { Name = "The Dog Blog", Url = "http://sample.com/dogs" });
                    context.Blogs.Add(new Blog { Name = "The Cat Blog", Url = "http://sample.com/cats" });
                    context.SaveChanges();
                }
            }
        }
        private static async Task SimulateRequestsAsync(IServiceProvider serviceProvider, Stopwatch stopwatch)
        {
            while (stopwatch.IsRunning)
            {
                using (var serviceScope = serviceProvider.CreateScope())
                {
                    await new BlogController(serviceScope.ServiceProvider.GetService<BloggingContext>()).ActionAsync();
                }

                Interlocked.Increment(ref _requestsProcessed);
            }
        }

        private static async void MonitorResults(TimeSpan duration, Stopwatch stopwatch)
        {
            var lastInstanceCount = 0L;
            var lastRequestCount = 0L;
            var lastElapsed = TimeSpan.Zero;

            stopwatch.Start();

            while (stopwatch.Elapsed < duration)
            {
                await Task.Delay(TimeSpan.FromSeconds(1));

                var instanceCount = BloggingContext.InstanceCount;
                var requestCount = _requestsProcessed;
                var elapsed = stopwatch.Elapsed;
                var currentElapsed = elapsed - lastElapsed;
                var currentRequests = requestCount - lastRequestCount;

                Console.WriteLine(
                    $"[{DateTime.Now:HH:mm:ss.fff}] "
                    + $"Context creations/second: {instanceCount - lastInstanceCount} | "
                    + $"Requests/second: {Math.Round(currentRequests / currentElapsed.TotalSeconds)}");

                lastInstanceCount = instanceCount;
                lastRequestCount = requestCount;
                lastElapsed = elapsed;
            }

            Console.WriteLine();
            Console.WriteLine($"Total context creations: {BloggingContext.InstanceCount}");
            Console.WriteLine(
                $"Requests per second:     {Math.Round(_requestsProcessed / stopwatch.Elapsed.TotalSeconds)}");

            stopwatch.Stop();
        }

How to avoid unnecessary errors with CRUD in EF Core?


Never do insert calculations in the same code. Always separate the formation / preparation of an object and its insertion / updating. Just spread it by function: checking the data entered by the user, calculating the necessary preliminary data, mapping or creating the object, and the actual CRUD operation.

What to do when things are really bad with application performance?


Beer will definitely not help here. But what will help is the separation of reading and writing in the application architecture, followed by the allocation of these operations on sockets. Think about using the Command and Query Responsibility Segregation (CQRS) pattern , and also try splitting tables into insert and read between two databases.

Speed ​​applications to you, friends and colleagues!

Source: https://habr.com/ru/post/undefined/


All Articles