COUNT(*) or MAX(id) - which is faster?

10















I have a web server with my own messaging system implemented.
I am at the phase where I need to create an API that checks whether the user has new message(s).
My DB table is simple:

ID - Auto Increment, Primary Key (Bigint)
Sender - Varchar (32) // Foreign Key to UserID hash from Users DB Table
Recipient - Varchar (32) // Foreign Key to UserID hash from Users DB Table
Message - Varchar (256) //UTF8 BIN

I am considering making an API that will determine whether there are new messages for a given user. I am thinking of using one of these methods:

A) Select count(*) of messages where sender or recipient is me.
(If this number > previous number, I have a new message.)

B) Select max(ID) of messages where sender or recipient is me.
(If max(ID) > previous number, I have a new message.)

My question is: can I somehow calculate which method will consume fewer server resources? Or is there an article about this? Maybe there is another method I have not mentioned?
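For readers who want to see the two candidate queries side by side, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it only illustrates what each query returns, not how InnoDB performs. The table name `messages`, the lowercase column names, and the helper names `count_for` and `max_id_for` are assumptions made for illustration.

```python
import sqlite3

# SQLite is only a stand-in for MySQL here; names follow the question's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " sender TEXT NOT NULL,"
    " recipient TEXT NOT NULL,"
    " message TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO messages (sender, recipient, message) VALUES (?, ?, ?)",
    [("alice", "bob", "hi"), ("bob", "alice", "hello"), ("carol", "dave", "yo")],
)


def count_for(user):
    # Method A: number of messages the user sent or received.
    row = conn.execute(
        "SELECT COUNT(*) FROM messages WHERE sender = ? OR recipient = ?",
        (user, user),
    ).fetchone()
    return row[0]


def max_id_for(user):
    # Method B: highest message id involving the user (None if there is none).
    row = conn.execute(
        "SELECT MAX(id) FROM messages WHERE sender = ? OR recipient = ?",
        (user, user),
    ).fetchone()
    return row[0]


print(count_for("alice"), max_id_for("alice"))
```

In both cases the client would remember the previous value and compare it with the new one on the next poll.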










  • 3
    I think you would be better off adding a timestamp column and checking against that value to see if there are newer records.
    – Dharman
    9 hours ago

  • Whether you query a timestamp or the ID, use MAX() on that column, and make sure it's indexed with (user_id, timestamp).
    – The Impaler
    9 hours ago

  • @Dharman I was thinking of that, but it costs extra DB space, and I am not sure it would be faster than either of my methods. I am storing a simple number (the current message count) in the usernames table.
    – FeHora
    9 hours ago

  • 1
    Calculate? No idea. But you can measure it. Fire off a few thousand of each query and watch machine metrics (cpu%, mem%, load average, etc.)
    – Sergio Tulentsev
    9 hours ago

  • 1
    While there is a good answer to this question below, I suspect you might be optimizing something that turns out not to be important. And unless you anticipate having literally millions of messages, I wouldn't worry about disk space, especially because the timestamp is small compared to your other fields. If you add timestamps, your table will be about 5MB larger for each million messages. That's really nothing.
    – Jerry
    8 hours ago

















php mysql performance






edited 5 hours ago by Peter Cordes
asked 9 hours ago by FeHora







4 Answers


















13














In MySQL InnoDB, SELECT COUNT(*) WHERE secondary_index = ? is an expensive operation, and when the user has a lot of messages this query might take a long time. Even when using an index, the engine still needs to count all matching records.

On the other hand, SELECT MAX(id) WHERE secondary_index = ? can deliver the highest id in that index very efficiently and runs in roughly constant time, independent of how many rows match, by doing a so-called loose index scan.

If you want to understand why, consider looking up the "B+ tree" data structure, which InnoDB uses to organise its data.

I suggest you go with SELECT MAX(id) if the requirement is only to check whether there are new messages (and not to count them).

Also, if you rely on the message count, you open the door to race conditions. What if the user deletes a message and receives a new one between two polling intervals? The count would be unchanged and the new message would go unnoticed.
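The delete-then-receive scenario can be reproduced in a few lines. This is a sketch only: it uses Python's sqlite3 module in place of MySQL, a simplified `messages` table with just a recipient column, and hypothetical helpers `send` and `poll`. It shows the correctness difference between the two checks and says nothing about InnoDB performance.

```python
import sqlite3

# SQLite stands in for MySQL; the simplified schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " recipient TEXT NOT NULL,"
    " message TEXT NOT NULL)"
)


def send(recipient, text):
    conn.execute(
        "INSERT INTO messages (recipient, message) VALUES (?, ?)",
        (recipient, text),
    )


def poll(recipient):
    # Returns (message count, highest id) for the recipient.
    return conn.execute(
        "SELECT COUNT(*), MAX(id) FROM messages WHERE recipient = ?",
        (recipient,),
    ).fetchone()


send("bob", "first")
send("bob", "second")
before = poll("bob")

# Between two polls: one message is deleted and a new one arrives.
conn.execute("DELETE FROM messages WHERE id = 1")
send("bob", "third")
after = poll("bob")

count_says_new = after[0] > before[0]   # the count did not grow
max_id_says_new = after[1] > before[1]  # the highest id did grow
print(before, after, count_says_new, max_id_says_new)
```

The count-based check misses the new message because the count is the same before and after, while the id-based check still notices it, since auto-increment ids only move forward.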































  • refer: dba.stackexchange.com/questions/130780/mysql-count-performance
    – Kaii
    9 hours ago

  • 1
    "SELECT MAX(id) will always use the primary index" - yeah, except for the cases when there's a WHERE on an unindexed field.
    – Sergio Tulentsev
    9 hours ago

  • @SergioTulentsev I forgot to mention in my main post that sender and recipient are foreign keys to the user hash (ID), the primary key in the users table. So they will always be indexed.
    – FeHora
    9 hours ago

  • 4
    If there's an index on a, then SELECT MAX(id) FROM tbl WHERE a=constant uses a so-called loose index scan. Those are almost miraculously fast. SELECT COUNT(*) FROM tbl WHERE a=constant does a tight index scan, which is not as fast.
    – O. Jones
    9 hours ago

  • 1
    @FeHora I strongly suggest setting up some sort of test environment, a database with generated records for you to play with.
    – Kaii
    8 hours ago



















1














If what you need is the information that someone has new messages, store exactly that. Update a field in the users table (I'm assuming that's the name) when a new message is recorded in the system. You have the recipient's ID, and that's all you need. You can create an AFTER INSERT trigger (assumption: there's a users2messages table) that updates the users table with a boolean flag indicating there's a message.

This approach is far faster than counting index entries, whether the index is primary or secondary. When the user performs an action, you update the users table with has_messages = 0; when a new message arrives, you update it with has_messages = 1. It's simple, it works, it scales, and using triggers to maintain it makes it easy and seamless.
I'm sure there will be naysayers who don't like triggers; in that case you can do it manually at the point of associating a user with a new message.
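A rough sketch of the flag-plus-trigger idea follows. It uses Python's sqlite3 module rather than MySQL, so the trigger syntax is SQLite's, and it attaches the trigger directly to a `messages` table instead of the users2messages table assumed above. The trigger name and the helpers `has_messages` and `mark_read` are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; schema, trigger name and helpers are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id TEXT PRIMARY KEY,
        has_messages INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        sender TEXT NOT NULL,
        recipient TEXT NOT NULL,
        message TEXT NOT NULL
    );
    -- Raise the recipient's flag whenever a message is inserted.
    CREATE TRIGGER messages_after_insert AFTER INSERT ON messages
    BEGIN
        UPDATE users SET has_messages = 1 WHERE id = NEW.recipient;
    END;
    INSERT INTO users (id) VALUES ('alice');
    INSERT INTO users (id) VALUES ('bob');
    """
)


def has_messages(user_id):
    # Single-row lookup by primary key.
    row = conn.execute(
        "SELECT has_messages FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return bool(row[0])


def mark_read(user_id):
    # Called when the user has fetched their messages.
    conn.execute("UPDATE users SET has_messages = 0 WHERE id = ?", (user_id,))


flag_before = has_messages("bob")
conn.execute(
    "INSERT INTO messages (sender, recipient, message) VALUES (?, ?, ?)",
    ("alice", "bob", "hi"),
)
flag_after_insert = has_messages("bob")
mark_read("bob")
flag_after_read = has_messages("bob")
print(flag_before, flag_after_insert, flag_after_read)
```

The polling query then touches one row in the users table, at the price of an extra write on every message insert.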





























  • Triggers aside, looking up a row using the PK and then reading it to check the boolean is still more expensive than executing a single loose index scan. It gets worse when you also add a WHERE clause to check the boolean flag, because of its low cardinality, even if you index that field. Sorry to tell you that, but you have a misunderstanding there.
    – Kaii
    8 hours ago

  • @Mjh I know about that, but it's definitely more expensive than my suggested methods, because it involves (at least) 1x update + 1x select.
    – FeHora
    8 hours ago

  • 1
    @Kaii SELECT has_messages FROM users WHERE id = 1; is the fastest query there is. It's an eq_ref, which is infinitely faster than counting a number of records in the table. The boolean field is not in the WHERE clause; the primary key is. Please assume better next time. Regarding updating the table: the update is fast as well, since it handles a single row located using the primary key. If the field already contains the value that you're updating to, no actual disk I/O occurs and there's a minimal performance penalty, much less than counting the records. You can measure.
    – Mjh
    8 hours ago



















0














If you need to know the number of new messages, then
Select count(*) from Messages where user_id in (sender, recipient) and id > last_seen_id would be your best option.

I'm a fan of using exists where possible, so to determine IF there are new messages, my query would be Select exists(Select 1 from Messages where user_id in (sender, recipient) and id > last_seen_id). The benefit of exists is that it returns true as soon as it finds one matching record.

Edit: To avoid any confusion when reading this answer, both of those queries would also include a check for other_user_id in (sender, recipient) in order to return only the messages between two specific users.
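As an illustration, the two queries above can be tried out like this. It is a sketch that uses Python's sqlite3 module in place of MySQL. The helper names `has_new` and `count_new` are made up, the client is assumed to remember `last_seen_id` between polls, and the extra other_user_id filter mentioned in the edit is left out for brevity.

```python
import sqlite3

# SQLite stands in for MySQL; helper names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " sender TEXT NOT NULL,"
    " recipient TEXT NOT NULL,"
    " message TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO messages (sender, recipient, message) VALUES (?, ?, ?)",
    [("alice", "bob", "hi"), ("bob", "alice", "hello"), ("carol", "bob", "yo")],
)


def has_new(user, last_seen_id):
    # EXISTS can stop at the first matching row.
    row = conn.execute(
        "SELECT EXISTS("
        " SELECT 1 FROM messages"
        " WHERE ? IN (sender, recipient) AND id > ?)",
        (user, last_seen_id),
    ).fetchone()
    return bool(row[0])


def count_new(user, last_seen_id):
    # COUNT(*) has to visit every matching row newer than last_seen_id.
    row = conn.execute(
        "SELECT COUNT(*) FROM messages"
        " WHERE ? IN (sender, recipient) AND id > ?",
        (user, last_seen_id),
    ).fetchone()
    return row[0]


print(has_new("bob", 1), count_new("bob", 1))
```

Because both queries are bounded by id > last_seen_id, the work they do depends on how many messages arrived since the last poll rather than on the user's whole history.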






































    0














    @FeHora You talk about avoiding keys to save DB space. The table design itself wastes more DB space.

    ID - Auto Increment, Primary Key (Bigint)

    Is bigint really necessary? Let us assume that a message is sent every second. Then an int unsigned is enough for about 136 years. And if you really have that many messages, a key is mandatory.
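The lifetime estimate for an auto-increment column is simple arithmetic and can be checked directly. The sketch below assumes one insert per second and a 365.25-day year; the function name `years_until_exhausted` is made up for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Largest values of the MySQL integer types under discussion.
INT_UNSIGNED_MAX = 2**32 - 1
BIGINT_SIGNED_MAX = 2**63 - 1


def years_until_exhausted(max_id, inserts_per_second=1.0):
    # Time until an auto-increment counter reaches max_id at a steady insert rate.
    return max_id / inserts_per_second / SECONDS_PER_YEAR


int_years = years_until_exhausted(INT_UNSIGNED_MAX)
bigint_years = years_until_exhausted(BIGINT_SIGNED_MAX)
print(f"INT UNSIGNED: about {int_years:.0f} years")
print(f"BIGINT: about {bigint_years:.2e} years")
```

At one message per second an unsigned 32-bit id lasts on the order of 136 years, so the choice between INT and BIGINT mostly comes down to 4 versus 8 bytes per row and per index entry.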



    Sender - Varchar (32) // Foreign Key to UserID hash from Users DB Table
    Recipient - Varchar (32) // Foreign Key to UserID hash from Users DB Table

    Why not use the UserID (usually an int unsigned)?

    Then I would add a seen flag. By the way, you can add the attribute NOT NULL to all fields.

    seen tinyint not NULL.

    Last but not least, I recommend the variant of @Mjh: define a flag has_messages, or new_messages, or both in the user record. Usually the user record is loaded anyway, so it is NOT an additional database query.




























13-vote answer above: answered 9 hours ago by Kaii, edited 9 hours ago












      • refer: dba.stackexchange.com/questions/130780/mysql-count-performance

        – Kaii
        9 hours ago






      • 1





        "SELECT MAX(id) will always use the primary index" - yeah, except for the cases when there's a where on an unindexed field.

        – Sergio Tulentsev
        9 hours ago











      • @SergioTulentsev i forgot to mention in my main post, sender and recipient are foreign keys to user-hash (ID) - primary key in users table. So it will be indexed always.

        – FeHora
        9 hours ago







      • 4





        If there's an index on a, then SELECT MAX(id) FROM tbl WHERE a=constant uses a so-called loose index scan. Those are almost miraculously fast. SELECT COUNT(*) FROM tbl WHERE a=constant does a tight index scan, which is not as fast.

        – O. Jones
        9 hours ago







      • 1





        @FeHora i strongly suggest to setup some sort of test environment, a database with generated records for you to play with.

        – Kaii
        8 hours ago


















      • refer: dba.stackexchange.com/questions/130780/mysql-count-performance

        – Kaii
        9 hours ago






      • 1





        "SELECT MAX(id) will always use the primary index" - yeah, except for the cases when there's a where on an unindexed field.

        – Sergio Tulentsev
        9 hours ago











      • @SergioTulentsev i forgot to mention in my main post, sender and recipient are foreign keys to user-hash (ID) - primary key in users table. So it will be indexed always.

        – FeHora
        9 hours ago







      • 4





        If there's an index on a, then SELECT MAX(id) FROM tbl WHERE a=constant uses a so-called loose index scan. Those are almost miraculously fast. SELECT COUNT(*) FROM tbl WHERE a=constant does a tight index scan, which is not as fast.

        – O. Jones
        9 hours ago







      • 1





        @FeHora i strongly suggest to setup some sort of test environment, a database with generated records for you to play with.

        – Kaii
        8 hours ago

















      refer: dba.stackexchange.com/questions/130780/mysql-count-performance

      – Kaii
      9 hours ago





      refer: dba.stackexchange.com/questions/130780/mysql-count-performance

      – Kaii
      9 hours ago




      1




      1





      "SELECT MAX(id) will always use the primary index" - yeah, except for the cases when there's a where on an unindexed field.

      – Sergio Tulentsev
      9 hours ago





      "SELECT MAX(id) will always use the primary index" - yeah, except for the cases when there's a where on an unindexed field.

      – Sergio Tulentsev
      9 hours ago


































      1




















      To know whether someone has new messages, record exactly that. Update a field in the users table (I'm assuming that's the name) whenever a new message is recorded in the system. You have the recipient's ID, and that's all you need. You can create an AFTER INSERT trigger (assumption: there's a users2messages table) that updates the users table with a boolean flag indicating there's a message.



      This approach is much faster than counting index entries, whether the index is primary or secondary. When the user performs an action, you update the users table with has_messages = 0; when a new message arrives, you update it with has_messages = 1. It's simple, it works, it scales, and using triggers to maintain it makes it easy and seamless.
      I'm sure there will be naysayers who don't like triggers; in that case you can do it manually at the point of associating a user with a new message.
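The trigger-maintained flag described above can be sketched as follows. This is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than MySQL (MySQL trigger syntax differs, for example it requires FOR EACH ROW), and the column names `recipient_id` and `body`, the trigger name `flag_new_message`, and the helper functions are hypothetical. Only `users`, `users2messages` and `has_messages` come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    has_messages INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE users2messages (
    id INTEGER PRIMARY KEY,
    recipient_id INTEGER NOT NULL REFERENCES users(id),
    body TEXT NOT NULL
);
-- After every new message, raise the recipient's flag.
CREATE TRIGGER flag_new_message AFTER INSERT ON users2messages
BEGIN
    UPDATE users SET has_messages = 1 WHERE id = NEW.recipient_id;
END;
""")

def has_messages(user_id):
    """Single primary-key lookup; no counting involved."""
    row = conn.execute(
        "SELECT has_messages FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return bool(row[0])

def mark_read(user_id):
    """Reset the flag once the user has looked at the messages."""
    conn.execute("UPDATE users SET has_messages = 0 WHERE id = ?", (user_id,))

conn.execute("INSERT INTO users (id) VALUES (1), (2)")
before = has_messages(1)
conn.execute("INSERT INTO users2messages (recipient_id, body) VALUES (1, 'hi')")
after_insert = has_messages(1)
mark_read(1)
after_read = has_messages(1)
print(before, after_insert, after_read)  # False True False
```

Without the trigger, the same UPDATE would be issued by the application right after it inserts the message, as the answer suggests for those who prefer to avoid triggers.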

















      – Mjh
      answered 8 hours ago












      • Triggers aside, looking up a row by the PK and then reading it to check the boolean is still more expensive than executing a single loose index scan. It gets worse when you also add a WHERE clause to check the boolean flag, because of its low cardinality, even if you index that field. Sorry to tell you that, but you have a misunderstanding there.

        – Kaii
        8 hours ago












      • @Mjh I know about that, but it's definitely more expensive than my suggested methods, because it involves (at least) one UPDATE plus one SELECT.

        – FeHora
        8 hours ago






      • 1





        @Kaii SELECT has_messages FROM users WHERE id = 1; is the fastest query there is. It's an eq_ref, which is far faster than counting a number of records in the table. The boolean field is not in the WHERE clause; the primary key is. Please assume better next time. Regarding updating the table: the update is fast as well, since it handles a single row located using the primary key. If the field already contains the value you're updating to, no actual disk I/O occurs and there's a minimal performance penalty, much less than counting the records. You can measure.

        – Mjh
        8 hours ago






































          0






















          If you need to know the number of new messages, then
          SELECT COUNT(*) FROM Messages WHERE user_id IN (sender, recipient) AND id > last_seen_id would be your best option.



          I'm a fan of using EXISTS where possible, so to determine whether there are any new messages, my query would be SELECT EXISTS(SELECT 1 FROM Messages WHERE user_id IN (sender, recipient) AND id > last_seen_id). The benefit of EXISTS is that it returns true as soon as it finds one record.



          Edit: To avoid any confusion when reading this answer, both of those queries would also include a check for other_user_id IN (sender, recipient) in order to return only the messages between two specific users.
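Both queries, including the extra check on the other user, can be tried out with the minimal sketch below. It is an illustration only: it uses SQLite through Python's sqlite3 module instead of MySQL, integer user IDs, a hand-written five-row table, and hypothetical helper names (`count_new`, `has_new`); the values for `user_id`, `other_user_id` and `last_seen_id` are passed in as named parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY,"
    " sender INTEGER NOT NULL,"
    " recipient INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO messages (id, sender, recipient) VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 2, 1), (3, 1, 3), (4, 2, 1), (5, 1, 2)],
)

# Shared filter: both users take part in the message and it is newer
# than the last message the user has seen.
NEW_IN_CONVERSATION = (
    ":user IN (sender, recipient)"
    " AND :other IN (sender, recipient)"
    " AND id > :last_seen"
)

def count_new(user, other, last_seen):
    """Number of messages between the two users with id > last_seen."""
    sql = f"SELECT COUNT(*) FROM messages WHERE {NEW_IN_CONVERSATION}"
    args = {"user": user, "other": other, "last_seen": last_seen}
    return conn.execute(sql, args).fetchone()[0]

def has_new(user, other, last_seen):
    """True if at least one such message exists; EXISTS can stop at the first hit."""
    sql = f"SELECT EXISTS (SELECT 1 FROM messages WHERE {NEW_IN_CONVERSATION})"
    args = {"user": user, "other": other, "last_seen": last_seen}
    return bool(conn.execute(sql, args).fetchone()[0])

print(count_new(1, 2, last_seen=2))  # 2 (messages 4 and 5)
print(has_new(1, 2, last_seen=2))    # True
print(has_new(1, 2, last_seen=5))    # False
```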















          – Aaron
          answered 2 hours ago, edited 2 hours ago





























                  0




















                  @FeHora You talk about not using keys to save DB space, but the table design itself wastes more DB space.



                  ID - Auto Increment, Primary Key (Bigint)


                  Is BIGINT really necessary? Let us assume that a message is sent every second. An INT UNSIGNED is then enough for about 136 years. And if you really have that many messages, a key is mandatory.
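The capacity estimate can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the 100-inserts-per-second scenario are hypothetical, and it assumes a constant insert rate with no gaps in the auto-increment sequence. At one insert per second, the 4,294,967,295 values of a MySQL INT UNSIGNED last roughly 136 years.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year

def years_until_exhausted(max_id, inserts_per_second=1.0):
    """Years an auto-increment column lasts at a constant insert rate."""
    return max_id / (inserts_per_second * SECONDS_PER_YEAR)

INT_UNSIGNED_MAX = 2**32 - 1   # largest MySQL INT UNSIGNED value
BIGINT_SIGNED_MAX = 2**63 - 1  # largest MySQL (signed) BIGINT value

# One message per second, as assumed in the answer.
print(round(years_until_exhausted(INT_UNSIGNED_MAX)))  # 136
# Hypothetical heavier load: 100 messages per second.
print(round(years_until_exhausted(INT_UNSIGNED_MAX, inserts_per_second=100), 1))  # 1.4
```

So whether INT UNSIGNED is enough depends on the expected message rate rather than on the column type alone.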



                  Sender - Varchar (32) // Foreign Key to UserID hash from Users DB Table
                  Recipient - Varchar (32) // Foreign Key to UserID hash from Users DB Table


                  Why not use the UserID (usually an INT UNSIGNED)?



                  Then I would add a seen flag. By the way, you can add the NOT NULL attribute to all fields.



                  seen TINYINT NOT NULL


                  Last but not least, I recommend @Mjh's variant: define a flag has_messages, or new_messages, or both, in the user record. Usually the user record is loaded anyway, so it is NOT an additional database query.
















                  – Wiimm
                  answered 45 secs ago


























