LiteDatabase.Shrink throws 'System.Collections.Generic.KeyNotFoundException' #892

Closed
shorstok opened this issue Feb 5, 2018 · 18 comments

@shorstok

shorstok commented Feb 5, 2018

I get a KeyNotFoundException on any attempt to load or view the database.
The database file that raises the exception was written by more than one process accessing the same database, so this may be a concurrency issue.

Snippet to reproduce:

using (var database = new LiteDatabase("sample.litedb"))
{
    database.Shrink();
}

or

using (var database = new LiteDatabase("sample.litedb"))
{
    database.Engine.FindAll(database.GetCollectionNames().FirstOrDefault()).ToArray();
}

etc.

Database file in question included:
sample.zip

Please tell me what the possible cause is and how to avoid it in future.

@shorstok
Author

shorstok commented Feb 9, 2018

@mbdavid any updates on this one? maybe you need extra context or some other info?

@mbdavid
Collaborator

mbdavid commented Feb 9, 2018

Hi @shorstok, sorry, I will check your case this weekend. During weekdays it is harder for me to debug code. I will return soon with an answer.

@mbdavid
Collaborator

mbdavid commented Feb 9, 2018

Hi @shorstok, your collection UnreadNotificationRecord seems corrupted. Are you using LiteDB v4? How do you use this collection? What kind of concurrency are you using?

@shorstok
Author

We're using LiteDB v4.1.1.

Actually, after ~2 days of trial and error, I'm able to reproduce the database corruption in an isolated test. It may seem a little contrived (I took the queries and record structure from our original desktop app for the reproduction), but it breaks the database pretty reliably.

Key points I guess:

  • many processes accessing one database
  • mixed find queries and insert queries
  • database file on NON-SSD drive (e.g. good old HDD)

The sample below is a console app that, when launched, spawns 15 copies of itself, all working with a single database.

You may need to run it 3 or 4 times to get the error, but I doubt it will take more than 5.

using System;
using System.Diagnostics;
using System.Linq;
using LiteDB;

namespace litedb_test
{
    /// <summary>
    /// Test record from desktop app
    /// </summary>

    public class UnreadNotificationRecord
    {
        public enum NotificationTypeEnum
        {
            Info,
            Error
        }

        [BsonId]
        public int Id { get; set; }

        public Guid UserId { get; set; }

        public string Title { get; set; }
        public string Message { get; set; }
        public NotificationTypeEnum NotificationType { get; set; }

        public DateTime When { get; set; }
    }



    /// <summary>
    /// 
    /// </summary>

    internal class Program
    {

        /// <summary>
        /// Defines the entry point of the application.
        /// </summary>
        /// <param name="args">The args.</param>
        [STAThread]
        static void Main(string[] args)
        {
            /*
             * Important!: 
             * `connectionString`
             * has to be path to **NON-SSD** drive
             */

            const string connectionString = "f://tmp//generated.litedb";

            //Some GUID keys to share across all processes
            Guid[] sharedGuids = {
                Guid.Parse("B9321547-D4BE-461F-B7F9-2E2600839428"),
                Guid.Parse("1F0689E8-121A-414D-80D1-2A54B516A6AC")
            };

            if (args.Length == 0)
            {
                const int processCount = 15;

                Console.WriteLine($"Spawning {processCount} child processes");

                for (int i = 0; i < processCount; i++)
                    Process.Start(Process.GetCurrentProcess().MainModule.FileName, $"child_{i}");

                return;
            }

            var procId = args[0];
            
            Console.WriteLine($"Running as `{procId}`");

            using (var database = new LiteDatabase(connectionString))
            {
                database.Shrink();

                var collection = database.GetCollection<UnreadNotificationRecord>();
                collection.EnsureIndex(x => x.UserId);

                for (int i = 0; i < 50; i++)
                {
                    var random = new Random();

                    var record = new UnreadNotificationRecord
                    {
                        UserId = sharedGuids[random.Next() % sharedGuids.Length],
                    };

                    Console.WriteLine($"Item[{i}]: {procId}");

                    //Every 2nd iteration run some query that actually has to yield some results

                    if (i % 2 == 0)
                        collection.
                            Find(Query.EQ(nameof(UnreadNotificationRecord.UserId), sharedGuids[random.Next() % sharedGuids.Length])).
                            ToArray();

                    //Every iteration insert new record

                    collection.Insert(record);
                }

                Console.WriteLine($"{procId} process finished");
            }
        }
    }
}

@mbdavid
Collaborator

mbdavid commented Feb 10, 2018

Hi @shorstok, thanks a lot for this code. I will try to get an HDD to test with; here I have only SSDs. If this occurs only on HDDs, it may be related to how regular disks write data. In the master branch I added a new connection string parameter (flush=true). With it, every write operation flushes to disk as soon as it finishes. It will be a very slow solution (on HDD, not on SSD), but it depends less on the OS.

Also, the shrink operation creates another database (in memory) and then copies all data from this new file over the original datafile. This overwrite may lose the file lock (I will test this).

I will try all options with your code.
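
The `flush=true` option mentioned above goes into the connection string. A minimal sketch, assuming a LiteDB build that already includes the option (at the time of this comment it existed only in the master branch) and the key/value connection string form with a `filename` key. The path is the one from the repro sample; the collection name and class name are hypothetical:

```csharp
using System;
using LiteDB;

internal static class FlushSketch
{
    // Assumption: this LiteDB build understands the `flush` option;
    // older builds may ignore or reject it.
    private const string ConnectionString =
        "filename=f:/tmp/generated.litedb;flush=true";

    private static void Main()
    {
        using (var database = new LiteDatabase(ConnectionString))
        {
            // "flush_demo" is a hypothetical collection name used for illustration.
            var collection = database.GetCollection("flush_demo");

            // Insert one document, then report how many documents the collection holds.
            collection.Insert(new BsonDocument { ["Note"] = "written with flush=true" });
            Console.WriteLine($"Documents stored: {collection.Count()}");
        }
    }
}
```

As noted in the comment, this trades write speed for less dependence on OS caching, so it is probably only worth trying where corruption has actually been observed.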

@shorstok
Author

@mbdavid removing database.Shrink alone makes everything run without errors in my sample, so you may be right about losing the file lock.
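
For readers stuck on an affected version, one possible workaround in the spirit of this observation is to shrink once, before any other process opens the file, rather than in every process. This is only a sketch of that idea, not the fix that went into LiteDB; the class name, collection name, and child count are illustrative assumptions:

```csharp
using System;
using System.Diagnostics;
using LiteDB;

internal static class ShrinkOnceSketch
{
    private const string ConnectionString = "f://tmp//generated.litedb";

    private static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            // Parent: shrink while no other process has the datafile open.
            using (var database = new LiteDatabase(ConnectionString))
            {
                database.Shrink();
            }

            // Only then start the children that share the datafile.
            for (int i = 0; i < 3; i++)
                Process.Start(Process.GetCurrentProcess().MainModule.FileName, $"child_{i}");

            return;
        }

        // Child: normal reads and writes, but no Shrink call.
        using (var database = new LiteDatabase(ConnectionString))
        {
            // "records" is a hypothetical collection name.
            var collection = database.GetCollection("records");
            collection.Insert(new BsonDocument { ["Source"] = args[0] });
            Console.WriteLine($"{args[0]} inserted one record");
        }
    }
}
```

This only avoids the concurrent-shrink path; it does not repair a datafile that is already corrupted.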

mbdavid added a commit that referenced this issue Feb 10, 2018
@mbdavid
Collaborator

mbdavid commented Feb 10, 2018

It was a single-line fix for me. The problem was in the Shrink method.

But in one of many attempts I got an _id duplicate key error. I am still investigating this.

And again, thanks for your code: it would have been extremely difficult to find the error without it.

@shorstok
Author

@mbdavid you're welcome :)

@Bounz

Bounz commented Mar 8, 2018

Hi @mbdavid.
I hit the same KeyNotFoundException (also during a Shrink call).
When this happens, new temp files are created as well; see the screenshot.

image

@mbdavid
Collaborator

mbdavid commented Mar 9, 2018

Hi @Bounz, this bug was fixed but the fix is not on NuGet yet... I forgot about it. 😨 I will release a new version this weekend.

@md-zamolxis

Hi @mbdavid, is it fixed in version 4.1.4 (the latest one)? For me it throws KeyNotFoundException in both 4.1.1.0 and 4.1.4.0.

@mbdavid
Collaborator

mbdavid commented Jun 26, 2018

Hi @md-zamolxis, yes, that bug was fixed in 4.1.3. If you are getting it in 4.1.4, maybe it's another problem. Does your datafile contain private data? Can you share it with me so I can find out what's going on?

@md-zamolxis

Just checked - it does not contain private data.

@mbdavid
Collaborator

mbdavid commented Jun 26, 2018

@md-zamolxis, which version of LiteDB did you use to create this database? You have a corrupted IndexNode link pointer... maybe from an old v4 version.

@md-zamolxis

@mbdavid, this database was created by version 4.1.1.0, and I tried to shrink it with both versions.

@mbdavid
Collaborator

mbdavid commented Jun 26, 2018

@md-zamolxis, your problem is not about shrink; your data is corrupted. I wrote this code and got an exception about IndexNode:

using System.Linq;
using LiteDB;

// Read every document of every collection; this is where the IndexNode exception is thrown.
using (var db = new LiteEngine(@"C:\Temp\SessionDatabase.ldb"))
{
    foreach (var col in db.GetCollectionNames())
    {
        db.FindAll(col).ToList();
    }
}

I will try to recover your data... and run it on 4.1.4.

@md-zamolxis

@mbdavid, thanks a lot, maybe it is broken, but on my live env everything works as expected.

@litedb-org litedb-org deleted a comment from md-zamolxis Jun 27, 2018
@mbdavid
Collaborator

mbdavid commented Jun 27, 2018

Hi @md-zamolxis, I added a static Recovery method to LiteEngine to recover a datafile with a missing link pointer (like your example). It's not 100% reliable, but it works fine on my first example.

So I recovered your data and it works fine in the master branch (with shrink too). You can get the latest version of LiteDB and run it on your datafile to check. If you have any problem, please open another issue about it.

@mbdavid mbdavid closed this as completed Jun 27, 2018
github-actions bot pushed a commit to Reddevildragg-UPM-Forks/LiteDB that referenced this issue Nov 18, 2020