Validation accuracy vs Testing accuracy


I am trying to get my head straight on terminology that I find confusing. I know there are three 'splits' of data used in machine learning models:

  1. Training data - train the model

  2. Validation data - cross-validation for model selection

  3. Testing data - test the generalisation error

Now, as far as I am aware, a separate validation set is not always used, since one can use k-fold cross-validation instead, which avoids having to reduce one's dataset further. The resulting accuracy is known as the validation accuracy. Then, once the best model is selected, it is tested on a 33% split of the initial dataset (which has not been used for training). Would the result of this be the testing accuracy?

Is this the right way around, or is it vice versa? I am finding conflicting terminology used online! I am trying to find some explanations for why my validation error is larger than my testing error, but before I look for a solution, I would like to get my terminology correct.

Thanks.
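The workflow described in the question can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `holdout_split` and `k_fold`, the seed, and the dataset size of 100 are hypothetical and not taken from the question. It shows a 33% test portion being set aside first, with k-fold cross-validation then run on the remaining data only.

```python
import random

def holdout_split(n_samples, test_fraction=0.33, seed=0):
    """Shuffle indices 0..n_samples-1 and set aside a test portion."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]  # (development, test)

def k_fold(indices, k):
    """Yield (train, validation) index lists; each index is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# Hypothetical dataset of 100 samples.
dev, test = holdout_split(100)       # test indices are never used below
splits = list(k_fold(dev, k=5))      # 5 (train, validation) pairs drawn from dev only
```

Under this reading, the accuracy averaged over the k validation folds would be the validation accuracy, and the accuracy on `test`, computed once at the end, would be the testing accuracy.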


















machine-learning

asked 6 hours ago by BillyJo_rambler


2 Answers

There isn't a standard terminology in this context (and I have seen long discussions and debates on this topic), so I completely understand your confusion. You should get used to seeing different terminology, and assume that it might not be consistent across sources.

I would like to point out a few things:

• I have never seen people use the expression "validation accuracy" (or dataset) to refer to the test accuracy (or dataset), but I have seen people use the term "test accuracy" (or dataset) to refer to the validation accuracy (or dataset). In other words, "test (or testing) accuracy" often refers to the validation accuracy, that is, the accuracy you calculate on data you do not use for training, but that you do use (during the training process) for validating (or "testing") the generalisation ability of your model or for early stopping.

• In k-fold cross-validation, people usually only mention two datasets: training and testing (or validation).

• k-fold cross-validation is just a way of validating the model on different subsets of the data. This can be done for several reasons. For example, if you have a small amount of data, a single validation set would be quite small, so you can get a better picture of the model's generalisation ability by validating it on several subsets of the whole dataset.

• You should likely have a dataset for testing that is separate from the validation dataset, because the validation dataset can be used for early stopping, so it is, in a certain way, part of the training process.

I would suggest using the following terminology:

• Training dataset: the data used to fit the model.

• Validation dataset: the data used, during the training process, to validate the generalisation ability of the model or for early stopping.

• Testing dataset: the data used for purposes other than training and validating.

Note that some of these datasets might overlap, but this is almost never a good thing (if you have enough data).

answered 5 hours ago by nbro (edited 58 mins ago)
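To make the early-stopping point concrete, here is a minimal sketch, assuming a simple patience rule; the function name `early_stopping_epoch` and the loss values are made up for illustration and do not come from any particular library. It shows how the validation loss decides which epoch is kept, which is why accuracy on that same validation data tends to be an optimistic estimate.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights would be kept.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch

# Made-up validation losses: improvement stalls after epoch 2,
# so the later value 0.50 is never reached.
losses = [0.90, 0.70, 0.60, 0.65, 0.66, 0.50]
chosen = early_stopping_epoch(losses, patience=2)
```

Because the stopping point was chosen to look good on the validation data, a separate testing dataset is the one that can give an untouched estimate.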












• If the testing dataset overlaps with either of the others, it is definitely not a good thing. The test accuracy must measure performance on unseen data. If any part of training saw the data, then it isn't test data, and representing it as such is dishonest. Allowing the validation set to overlap with the training set isn't dishonest, but it probably won't accomplish its task as well. (E.g., if you're doing early stopping, and your validation and training sets overlap, overfitting may occur and not be detected.)
  – Ray, 1 hour ago

• @Ray I didn't say it is a good thing. Indeed, see my point "You should likely have a separate (from the validation dataset) dataset for testing...".
  – nbro, 1 hour ago

• You said "If that's a 'good' thing or not, it's another question." I suspected from the rest that you understood the problems that that overlap would cause, but the problems with that should be made very clear, since contaminating your test data with training samples completely ruins its value.
  – Ray, 1 hour ago

• @Ray I wanted more to refer to the overlap between the training and validation datasets. Anyway, I think it's good that you wanted to clarify or emphasise this point. I edited my answer to emphasise this point.
  – nbro, 59 mins ago
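The overlap discussed in these comments is cheap to check before any accuracy is reported. The following is a rough sketch, assuming each sample carries a unique identifier; the helper name `find_overlaps` and the ids are hypothetical.

```python
def find_overlaps(train_ids, val_ids, test_ids):
    """Return the sample ids shared between each pair of splits."""
    train, val, test = set(train_ids), set(val_ids), set(test_ids)
    return {
        "train/test": train & test,            # must be empty
        "validation/test": val & test,         # must be empty
        "train/validation": train & val,       # best kept empty as well
    }

# Made-up ids: sample 4 appears in both the training and validation splits.
overlaps = find_overlaps(train_ids=[1, 2, 3, 4], val_ids=[4, 5], test_ids=[6, 7])
```

In this made-up case the test split is clean, while the shared sample between training and validation is the kind of overlap that could hide overfitting during early stopping.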



















@nbro's answer is complete; I will just add a couple of explanations to supplement it. In more traditional textbooks, data is often partitioned into two sets: training and test. In recent years, with more complex models and an increasing need for model selection, development sets (or validation sets) are also used. The development/validation set should have no overlap with the test set; otherwise the reported accuracy or error evaluation is not valid. In the modern setting, the model is trained on the training set and tested on the validation set to see if it is a good fit; the model may then be tweaked, trained again, and validated again, multiple times. When the final model is selected, the testing set is used to calculate the accuracy and error figures that are reported. The important thing is that the test set is only touched once.

– user3089485 (new contributor)
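The select-on-validation, report-on-test loop described above might look like the sketch below. It is a toy under stated assumptions: the "models" are fixed thresholds standing in for trained and tweaked candidates (so the training step is omitted), and the data, `accuracy`, and `make_threshold_model` are all invented for illustration.

```python
def accuracy(predict, samples):
    """Fraction of (x, label) pairs that `predict` classifies correctly."""
    return sum(predict(x) == label for x, label in samples) / len(samples)

def make_threshold_model(threshold):
    """Toy classifier: predicts 1 when x is at least `threshold`."""
    return lambda x: int(x >= threshold)

# Made-up labelled data; the two sets share no samples.
validation = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
test = [(0.1, 0), (0.45, 0), (0.7, 1), (0.8, 1)]

# Model selection: compare candidates on the validation set only.
candidates = [0.3, 0.5, 0.8]
best_threshold = max(
    candidates, key=lambda t: accuracy(make_threshold_model(t), validation)
)

# Final report: the test set is evaluated once, for the chosen model only.
test_accuracy = accuracy(make_threshold_model(best_threshold), test)
```

The design point is that `test` appears exactly once, after `best_threshold` is fixed; looping over candidates with `test` in place of `validation` would turn it into a validation set.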













            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            1












            $begingroup$

            There isn't a standard terminology in this context (and I have seen long discussions and debates regarding this topic), so I completely understand you, but you should get used to different terminology (and assume that terminology might not be consistent or it change across sources).



            I would like to point out a few things:



            • I have never seen people use the expression "validation accuracy" (or dataset) to refer to the test accuracy (or dataset), but I have seen people use the term "test accuracy" (or dataset) to refer to the validation accuracy (or dataset). In other words, the test (or testing) accuracy often refers to the validation accuracy, that is, the accuracy you calculate on the data set you do not use for training, but you use (during the training process) for validating (or "testing") the generalisation ability of your model or for "early stopping".


            • In k-fold cross-validation, people usually only mention two datasets: training and testing (or validation).


            • k-fold cross-validation is just a way of validating the model on different subsets of the data. This can be done for several reasons. For example, you have a small amount of data, so your validation (and training) dataset is quite small, so you want to have a better understanding of the model's generalisation ability by validating it on several subsets of the whole dataset.


            • You should likely have a separate (from the validation dataset) dataset for testing, because the validation dataset can be used for early stopping, so, in a certain way, it is dependent on the training process


            I would suggest to use the following terminology



            • Training dataset: the data used to fit the model.

            • Validation dataset: the data used to validate the generalisation ability of the model or for early stopping, during the training process.

            • Testing dataset: the data used to for other purposes other than training and validating.

            Note that some of these datasets might overlap, but this might almost never be a good thing (if you have enough data).






            share|cite|improve this answer











            $endgroup$












            • $begingroup$
              If the testing dataset overlaps with either of the others, it is definitely not a good thing. The test accuracy must measure performance on unseen data. If any part of training saw the data, then it isn't test data, and representing it as such is dishonest. Allowing the validation set to overlap with the training set isn't dishonest, but it probably won't accomplish its task as well. (e.g., if you're doing early stopping, and your validation set and training sets overlap, overfitting may occur and not be detected.)
              $endgroup$
              – Ray
              1 hour ago











            • $begingroup$
              @Ray I didn't say it is a good thing. Indeed, see my point "You should likely have a separate (from the validation dataset) dataset for testing...".
              $endgroup$
              – nbro
              1 hour ago











            • $begingroup$
              You said "If that's a 'good' thing or not, it's another question." I suspected from the rest that you understood the problems that that overlap would cause, but the problems with that should be made very clear, since contaminating your test data with training samples completely ruins its value.
              $endgroup$
              – Ray
              1 hour ago











            • $begingroup$
              @Ray I wanted more to refer to the overlap between the training and validation datasets. Anyway, I think it's good that you wanted to clarify or emphasise this point. I edited my answer to emphasise this point.
              $endgroup$
              – nbro
              59 mins ago
















            1












            $begingroup$

            There isn't a standard terminology in this context (and I have seen long discussions and debates regarding this topic), so I completely understand you, but you should get used to different terminology (and assume that terminology might not be consistent or it change across sources).



            I would like to point out a few things:



            • I have never seen people use the expression "validation accuracy" (or dataset) to refer to the test accuracy (or dataset), but I have seen people use the term "test accuracy" (or dataset) to refer to the validation accuracy (or dataset). In other words, the test (or testing) accuracy often refers to the validation accuracy, that is, the accuracy you calculate on the data set you do not use for training, but you use (during the training process) for validating (or "testing") the generalisation ability of your model or for "early stopping".


            • In k-fold cross-validation, people usually only mention two datasets: training and testing (or validation).


            • k-fold cross-validation is just a way of validating the model on different subsets of the data. This can be done for several reasons. For example, you have a small amount of data, so your validation (and training) dataset is quite small, so you want to have a better understanding of the model's generalisation ability by validating it on several subsets of the whole dataset.


            • You should likely have a separate (from the validation dataset) dataset for testing, because the validation dataset can be used for early stopping, so, in a certain way, it is dependent on the training process


            I would suggest to use the following terminology



            • Training dataset: the data used to fit the model.

            • Validation dataset: the data used to validate the generalisation ability of the model or for early stopping, during the training process.

            • Testing dataset: the data used to for other purposes other than training and validating.

            Note that some of these datasets might overlap, but this might almost never be a good thing (if you have enough data).






            share|cite|improve this answer











            $endgroup$












            • $begingroup$
              If the testing dataset overlaps with either of the others, it is definitely not a good thing. The test accuracy must measure performance on unseen data. If any part of training saw the data, then it isn't test data, and representing it as such is dishonest. Allowing the validation set to overlap with the training set isn't dishonest, but it probably won't accomplish its task as well. (e.g., if you're doing early stopping, and your validation set and training sets overlap, overfitting may occur and not be detected.)
              $endgroup$
              – Ray
              1 hour ago











            • $begingroup$
              @Ray I didn't say it is a good thing. Indeed, see my point "You should likely have a separate (from the validation dataset) dataset for testing...".
              $endgroup$
              – nbro
              1 hour ago











            • $begingroup$
              You said "If that's a 'good' thing or not, it's another question." I suspected from the rest that you understood the problems that that overlap would cause, but the problems with that should be made very clear, since contaminating your test data with training samples completely ruins its value.
              $endgroup$
              – Ray
              1 hour ago











            • $begingroup$
              @Ray I wanted more to refer to the overlap between the training and validation datasets. Anyway, I think it's good that you wanted to clarify or emphasise this point. I edited my answer to emphasise this point.
              $endgroup$
              – nbro
              59 mins ago














            1












            1








            1





            $begingroup$

            There isn't a standard terminology in this context (and I have seen long discussions and debates regarding this topic), so I completely understand you, but you should get used to different terminology (and assume that terminology might not be consistent or it change across sources).



            I would like to point out a few things:



            • I have never seen people use the expression "validation accuracy" (or dataset) to refer to the test accuracy (or dataset), but I have seen people use the term "test accuracy" (or dataset) to refer to the validation accuracy (or dataset). In other words, the test (or testing) accuracy often refers to the validation accuracy, that is, the accuracy you calculate on the data set you do not use for training, but you use (during the training process) for validating (or "testing") the generalisation ability of your model or for "early stopping".


            • In k-fold cross-validation, people usually only mention two datasets: training and testing (or validation).


            • k-fold cross-validation is just a way of validating the model on different subsets of the data. This can be done for several reasons. For example, you have a small amount of data, so your validation (and training) dataset is quite small, so you want to have a better understanding of the model's generalisation ability by validating it on several subsets of the whole dataset.


            • You should likely have a separate (from the validation dataset) dataset for testing, because the validation dataset can be used for early stopping, so, in a certain way, it is dependent on the training process


            I would suggest to use the following terminology



            • Training dataset: the data used to fit the model.

            • Validation dataset: the data used to validate the generalisation ability of the model or for early stopping, during the training process.

            • Testing dataset: the data used to for other purposes other than training and validating.

            Note that some of these datasets might overlap, but this might almost never be a good thing (if you have enough data).






            share|cite|improve this answer











            $endgroup$



            There isn't a standard terminology in this context (and I have seen long discussions and debates regarding this topic), so I completely understand you, but you should get used to different terminology (and assume that terminology might not be consistent or it change across sources).



            I would like to point out a few things:



            • I have never seen people use the expression "validation accuracy" (or dataset) to refer to the test accuracy (or dataset), but I have seen people use the term "test accuracy" (or dataset) to refer to the validation accuracy (or dataset). In other words, the test (or testing) accuracy often refers to the validation accuracy, that is, the accuracy you calculate on the data set you do not use for training, but you use (during the training process) for validating (or "testing") the generalisation ability of your model or for "early stopping".


            • In k-fold cross-validation, people usually only mention two datasets: training and testing (or validation).


            • k-fold cross-validation is just a way of validating the model on different subsets of the data. This can be done for several reasons. For example, you have a small amount of data, so your validation (and training) dataset is quite small, so you want to have a better understanding of the model's generalisation ability by validating it on several subsets of the whole dataset.


            • You should likely have a separate (from the validation dataset) dataset for testing, because the validation dataset can be used for early stopping, so, in a certain way, it is dependent on the training process


            I would suggest to use the following terminology



            • Training dataset: the data used to fit the model.

            • Validation dataset: the data used to validate the generalisation ability of the model or for early stopping, during the training process.

            • Testing dataset: the data used to for other purposes other than training and validating.

            Note that some of these datasets might overlap, but this might almost never be a good thing (if you have enough data).







            share|cite|improve this answer














            share|cite|improve this answer



            share|cite|improve this answer








            edited 58 mins ago

























            answered 5 hours ago









            nbronbro

            8111023




            8111023











            • $begingroup$
              If the testing dataset overlaps with either of the others, it is definitely not a good thing. The test accuracy must measure performance on unseen data. If any part of training saw the data, then it isn't test data, and representing it as such is dishonest. Allowing the validation set to overlap with the training set isn't dishonest, but it probably won't accomplish its task as well. (e.g., if you're doing early stopping, and your validation set and training sets overlap, overfitting may occur and not be detected.)
              $endgroup$
              – Ray
              1 hour ago











            • $begingroup$
              @Ray I didn't say it is a good thing. Indeed, see my point "You should likely have a separate (from the validation dataset) dataset for testing...".
              $endgroup$
              – nbro
              1 hour ago











            • $begingroup$
              You said "If that's a 'good' thing or not, it's another question." I suspected from the rest that you understood the problems that that overlap would cause, but the problems with that should be made very clear, since contaminating your test data with training samples completely ruins its value.
              $endgroup$
              – Ray
              1 hour ago











            • $begingroup$
              @Ray I wanted more to refer to the overlap between the training and validation datasets. Anyway, I think it's good that you wanted to clarify or emphasise this point. I edited my answer to emphasise this point.
              $endgroup$
              – nbro
              59 mins ago

















            • $begingroup$
              If the testing dataset overlaps with either of the others, it is definitely not a good thing. The test accuracy must measure performance on unseen data. If any part of training saw the data, then it isn't test data, and representing it as such is dishonest. Allowing the validation set to overlap with the training set isn't dishonest, but it probably won't accomplish its task as well. (e.g., if you're doing early stopping, and your validation set and training sets overlap, overfitting may occur and not be detected.)
              $endgroup$
              – Ray
              1 hour ago











            • $begingroup$
              @Ray I didn't say it is a good thing. Indeed, see my point "You should likely have a separate (from the validation dataset) dataset for testing...".
              $endgroup$
              – nbro
              1 hour ago











            • $begingroup$
              You said "If that's a 'good' thing or not, it's another question." I suspected from the rest that you understood the problems that that overlap would cause, but the problems with that should be made very clear, since contaminating your test data with training samples completely ruins its value.
              $endgroup$
              – Ray
              1 hour ago











            • $begingroup$
              @Ray I wanted more to refer to the overlap between the training and validation datasets. Anyway, I think it's good that you wanted to clarify or emphasise this point. I edited my answer to emphasise this point.
              $endgroup$
              – nbro
              59 mins ago






























            1












            $begingroup$

            @nbro's answer is complete; I'll just add a couple of explanations to supplement it. In more traditional textbooks, data is often partitioned into two sets: training and test. In recent years, with more complex models and an increasing need for model selection, development (validation) sets are also used. The development/validation set should have no overlap with the test set; otherwise the reported accuracy/error evaluation is not valid. In the modern setting, the model is trained on the training set and evaluated on the validation set to see whether it is a good fit; the model may then be tweaked, retrained, and validated again multiple times. Once the final model is selected, the test set is used to compute the reported accuracy and error. The important thing is that the test set is touched only once.
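To make the workflow concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration and not part of the original answer: the toy data, the split fractions, the helper names `three_way_split`, `knn_predict` and `accuracy`, and the choice of a 1-D k-nearest-neighbours classifier as the "model". It shows the three ideas above: the splits are disjoint, the hyperparameter `k` is chosen using only the validation set, and the test set is evaluated exactly once at the end.

```python
import random


def three_way_split(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle indices 0..n_samples-1 and cut them into disjoint train/val/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


def knn_predict(x, k, xs, ys, train_idx):
    """Majority vote among the k training points closest to x (k is odd, so no ties)."""
    nearest = sorted(train_idx, key=lambda i: abs(xs[i] - x))[:k]
    votes = sum(ys[i] for i in nearest)
    return votes * 2 > k


def accuracy(k, xs, ys, train_idx, eval_idx):
    """Fraction of samples in eval_idx that the k-NN rule classifies correctly."""
    correct = sum(knn_predict(xs[i], k, xs, ys, train_idx) == ys[i] for i in eval_idx)
    return correct / len(eval_idx)


# Toy data, made up for illustration: label is (x > 0.5), flipped with probability 0.1.
rng = random.Random(42)
xs = [rng.random() for _ in range(500)]
ys = [(x > 0.5) != (rng.random() < 0.1) for x in xs]

train_idx, val_idx, test_idx = three_way_split(len(xs))

# The three index sets must not share any sample.
assert set(train_idx).isdisjoint(val_idx)
assert set(train_idx).isdisjoint(test_idx)
assert set(val_idx).isdisjoint(test_idx)

# Model selection: try several values of k, scoring each on the validation set only.
candidate_ks = [1, 3, 5, 9, 15]
best_k = max(candidate_ks, key=lambda k: accuracy(k, xs, ys, train_idx, val_idx))
val_acc = accuracy(best_k, xs, ys, train_idx, val_idx)

# Final report: the test set is used exactly once, after selection is finished.
test_acc = accuracy(best_k, xs, ys, train_idx, test_idx)
print(f"chosen k={best_k}  validation accuracy={val_acc:.3f}  test accuracy={test_acc:.3f}")
```

Because `best_k` was picked to maximise the validation score, the validation accuracy tends to be a somewhat optimistic estimate, which is why the separately held-out test accuracy is the number to report.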












            $endgroup$

















                answered 2 hours ago









                user3089485



























