ROCSS



Stata help for rocss

-------------------------------------------------------------------------------------------------------------------------------------------
help for rocss
-------------------------------------------------------------------------------------------------------------------------------------------

ROC curve and other statistics for any classification method

        rocss dep_var prob_var [if exp] [in range] [, ncut(#) savedata(filename) graph replace]

 Description

    rocss calculates sensitivity, specificity, cumulative area under the ROC curve and percentage of subjects correctly classified at
    user-specified probability cutoffs.

    dep_var is the binary outcome variable coded 0, 1.

    prob_var contains the estimated probabilities that dep_var==1.

    An example with four observations:

        id    dep_var    prob_var
         1       0          0.2
         2       1          0.8
         3       1          0.9
         4       0          0.3
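
    The four-observation example can be used to check by hand what sensitivity and specificity mean at a single cutoff. The sketch
    below is an illustration only and is not taken from rocss.ado; the cutoff of 0.5, the helper variable pos and the rule "classify
    as positive when prob_var >= cutoff" (borrowed from lstat) are assumptions.

```stata
* Enter the four-observation example from the Description
clear
input id dep_var prob_var
1 0 0.2
2 1 0.8
3 1 0.9
4 0 0.3
end

* Assumed rule: positive when the probability is at or above the cutoff
local cutoff 0.5
generate byte pos = prob_var >= `cutoff'

* Count subjects with and without the outcome
quietly count if dep_var == 1
local nD = r(N)
quietly count if dep_var == 0
local nND = r(N)

* Sensitivity = true positives / subjects with the outcome
quietly count if pos == 1 & dep_var == 1
local sens = r(N) / `nD'

* Specificity = true negatives / subjects without the outcome
quietly count if pos == 0 & dep_var == 0
local spec = r(N) / `nND'

display "sensitivity = " `sens' "  specificity = " `spec'
```

    With these data both quantities equal 1 at a cutoff of 0.5, because every subject with dep_var==1 has a probability above 0.5 and
    every subject with dep_var==0 has a probability below it.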

 Remarks

    Unlike lsens, rocss is not a post-estimation command: it works on any variable containing estimated probabilities, and the user
    chooses how many equally spaced probability cutoffs to evaluate (see ncut()).  When applied to predicted probabilities derived by
    logistic, logit or probit, rocss is a flexible alternative to lsens.
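
    Because rocss only needs the outcome and a probability variable, the same call works whatever model produced the probabilities.
    A minimal sketch, assuming rocss is installed and using probit on the lbw data; the variable name pprobit is hypothetical:

```stata
* Fit a probit model and store the predicted probabilities
webuse lbw, clear
probit low age lwt smoke ptl ht ui
predict pprobit

* rocss takes the outcome and the probability variable directly
rocss low pprobit, ncut(20)
```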

 Options

     ncut(#) specifies the number of equally spaced probability intervals in the range [0, 1]. The number of corresponding probability
        cutoffs will be (# + 1), at values 0, 1/#, 2/#, ..., 1. The default is 10 equally spaced intervals.
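
        The grid of cutoffs implied by ncut(#) can be listed with a short loop. This is only a sketch of the arithmetic described
        above, not code from rocss.ado; the local macro ncut is an assumed name.

```stata
* List the (# + 1) cutoffs 0, 1/#, 2/#, ..., 1 for # = 10
local ncut 10
forvalues i = 0/`ncut' {
    display %5.3f `i'/`ncut'
}
```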

     savedata(filename) specifies the name of a new dataset created to contain the probability cutoffs and corresponding sensitivity,
        specificity, cumulative area under the ROC curve and the percentage of subjects correctly classified.  The dataset is saved in the
        current directory.

     graph graphs sensitivity versus 1-specificity (see help for lroc) calculated at each probability cutoff.

     replace requests that if the dataset specified in savedata(filename) already exists, it should be overwritten.

 Examples

    . webuse lbw, clear
    . logistic low age lwt smoke ptl ht ui
    . lstat
    . lroc, nograph
    . lstat, cutoff(0.30)
    . predict p

    . rocss low p  // compare the results
    . rocss low p, ncut(20) gr
    . rocss low p, saved(allsens)
    . rocss low p, ncut(80) gr saved(allsens) rep 
 
 Authors

    Nicola Orsini, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden and Institute of Information Science and
    Technology, National Research Council of Italy, Pisa, Italy.

    Matteo Bottai, Arnold School of Public Health, University of South Carolina, Columbia, USA and Institute of Information Science and
    Technology, National Research Council of Italy, Pisa, Italy.

 Also see

           [R] logistic

On-line:  help for lroc, lstat, lsens, roc

Worked examples for rocss

Before running the following worked examples, be sure to have an up-to-date version of rocss. Type
. net install http://nicolaorsini.altervista.org/stata/rocss  
. which rocss    
c:\ado\plus\r\rocss.ado
*!  Version 1.0 - March 11, 2004 - N.Orsini

. webuse lbw, clear   
(Hosmer & Lemeshow data)

. logistic low age lwt smoke ptl ht ui     

Logistic regression                               Number of obs   =        189
                                                  LR chi2(6)      =      25.88
                                                  Prob > chi2     =     0.0002
Log likelihood = -104.39591                       Pseudo R2       =     0.1103

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .9586258   .0331527    -1.22   0.222      .895801    1.025857
         lwt |   .9858131   .0065579    -2.15   0.032     .9730433    .9987505
       smoke |   1.734347   .5959725     1.60   0.109     .8843789    3.401213
         ptl |    1.80987    .630593     1.70   0.089     .9142657    3.582796
          ht |   6.439757   4.419149     2.71   0.007     1.677839     24.7166
          ui |   2.089219   .9537039     1.61   0.107     .8539267    5.111489
------------------------------------------------------------------------------

. lstat   

Logistic model for low

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |        16            13  |         29
     -     |        43           117  |        160
-----------+--------------------------+-----------
   Total   |        59           130  |        189

Classified + if predicted Pr(D) >= .5
True D defined as low != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   27.12%
Specificity                     Pr( -|~D)   90.00%
Positive predictive value       Pr( D| +)   55.17%
Negative predictive value       Pr(~D| -)   73.13%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   10.00%
False - rate for true D         Pr( -| D)   72.88%
False + rate for classified +   Pr(~D| +)   44.83%
False - rate for classified -   Pr( D| -)   26.88%
--------------------------------------------------
Correctly classified                        70.37%
--------------------------------------------------

. lroc, nograph  

Logistic model for low

number of observations =      189
area under ROC curve   =   0.7373

. lstat, cutoff(0.30)  

Logistic model for low

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |        38            42  |         80
     -     |        21            88  |        109
-----------+--------------------------+-----------
   Total   |        59           130  |        189

Classified + if predicted Pr(D) >= .3
True D defined as low != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   64.41%
Specificity                     Pr( -|~D)   67.69%
Positive predictive value       Pr( D| +)   47.50%
Negative predictive value       Pr(~D| -)   80.73%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   32.31%
False - rate for true D         Pr( -| D)   35.59%
False + rate for classified +   Pr(~D| +)   52.50%
False - rate for classified -   Pr( D| -)   19.27%
--------------------------------------------------
Correctly classified                        66.67%
--------------------------------------------------

. predict p  
(option p assumed; Pr(low))

. rocss low p  // compare the results    

     +------------------------------------------------------+
     | cutoff     sens     spec   omspec    cclass    carea |
     |------------------------------------------------------|
  1. |  0.000   1.0000   0.0000   1.0000   31.2169   0.0000 |
  2. |  0.100   1.0000   0.0769   0.9231   36.5079   0.0769 |
  3. |  0.200   0.8983   0.3692   0.6308   53.4392   0.3544 |
  4. |  0.300   0.6441   0.6769   0.3231   66.6667   0.5917 |
  5. |  0.400   0.4407   0.8615   0.1385   73.0159   0.6918 |
  6. |  0.500   0.2712   0.9000   0.1000   70.3704   0.7055 |
  7. |  0.600   0.1864   0.9462   0.0538   70.8995   0.7160 |
  8. |  0.700   0.0508   0.9846   0.0154   69.3122   0.7206 |
  9. |  0.800   0.0169   0.9923   0.0077   68.7831   0.7209 |
 10. |  0.900   0.0000   1.0000   0.0000   68.7831   0.7209 |
 11. |  1.000   0.0000   1.0000   0.0000   68.7831   0.7209 |
     +------------------------------------------------------+

Number of observations                       = 189
Number of probability cutoffs (10+1)         = 11
Area under ROC curve                         = 0.7209
Highest value of correctly classified        = 73.0159

. rocss low p, ncut(20) gr   

     +------------------------------------------------------+
     | cutoff     sens     spec   omspec    cclass    carea |
     |------------------------------------------------------|
  1. |  0.000   1.0000   0.0000   1.0000   31.2169   0.0000 |
  2. |  0.050   1.0000   0.0077   0.9923   31.7460   0.0077 |
  3. |  0.100   1.0000   0.0769   0.9231   36.5079   0.0769 |
  4. |  0.150   0.9661   0.1846   0.8154   42.8571   0.1828 |
  5. |  0.200   0.8983   0.3692   0.6308   53.4392   0.3549 |
  6. |  0.250   0.8475   0.5385   0.4615   63.4921   0.5026 |
  7. |  0.300   0.6441   0.6769   0.3231   66.6667   0.6059 |
  8. |  0.350   0.5424   0.8000   0.2000   71.9577   0.6789 |
  9. |  0.400   0.4407   0.8615   0.1385   73.0159   0.7091 |
 10. |  0.450   0.3220   0.8923   0.1077   71.4286   0.7209 |
 11. |  0.500   0.2712   0.9000   0.1000   70.3704   0.7231 |
 12. |  0.550   0.2542   0.9077   0.0923   70.3704   0.7252 |
 13. |  0.600   0.1864   0.9462   0.0538   70.8995   0.7336 |
 14. |  0.650   0.1017   0.9692   0.0308   69.8413   0.7370 |
 15. |  0.700   0.0508   0.9846   0.0154   69.3122   0.7381 |
 16. |  0.750   0.0339   0.9923   0.0077   69.3122   0.7385 |
 17. |  0.800   0.0169   0.9923   0.0077   68.7831   0.7385 |
 18. |  0.850   0.0169   0.9923   0.0077   68.7831   0.7385 |
 19. |  0.900   0.0000   1.0000   0.0000   68.7831   0.7385 |
 20. |  0.950   0.0000   1.0000   0.0000   68.7831   0.7385 |
 21. |  1.000   0.0000   1.0000   0.0000   68.7831   0.7385 |
     +------------------------------------------------------+

Number of observations                       = 189
Number of probability cutoffs (20+1)         = 21
Area under ROC curve                         = 0.7385
Highest value of correctly classified        = 73.0159

[Figure: ROC curve #1, sensitivity versus 1-specificity at the 21 cutoffs from ncut(20)]

. rocss low p, saved(allsens)    

     +------------------------------------------------------+
     | cutoff     sens     spec   omspec    cclass    carea |
     |------------------------------------------------------|
  1. |  0.000   1.0000   0.0000   1.0000   31.2169   0.0000 |
  2. |  0.100   1.0000   0.0769   0.9231   36.5079   0.0769 |
  3. |  0.200   0.8983   0.3692   0.6308   53.4392   0.3544 |
  4. |  0.300   0.6441   0.6769   0.3231   66.6667   0.5917 |
  5. |  0.400   0.4407   0.8615   0.1385   73.0159   0.6918 |
  6. |  0.500   0.2712   0.9000   0.1000   70.3704   0.7055 |
  7. |  0.600   0.1864   0.9462   0.0538   70.8995   0.7160 |
  8. |  0.700   0.0508   0.9846   0.0154   69.3122   0.7206 |
  9. |  0.800   0.0169   0.9923   0.0077   68.7831   0.7209 |
 10. |  0.900   0.0000   1.0000   0.0000   68.7831   0.7209 |
 11. |  1.000   0.0000   1.0000   0.0000   68.7831   0.7209 |
     +------------------------------------------------------+

Number of observations                       = 189
Number of probability cutoffs (10+1)         = 11
Area under ROC curve                         = 0.7209
Highest value of correctly classified        = 73.0159
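
The carea column in the listing above appears consistent with a trapezoidal rule applied cumulatively along the ROC curve: for example, between cutoffs 0.100 and 0.200 the added area is (0.9231 - 0.6308) x (1.0000 + 0.8983)/2, about 0.2774, which takes carea from 0.0769 to about 0.3544. The sketch below recomputes it from the saved dataset. It assumes that allsens.dta stores variables named cutoff, sens, omspec and carea as in the listing header; mycarea is a hypothetical name, and this is not necessarily how rocss.ado computes the area.

```stata
* Load the dataset written by savedata(allsens)
use allsens, clear
sort cutoff

* Cumulative trapezoidal area: width is the drop in 1-specificity,
* height is the mean sensitivity of two adjacent cutoffs
generate double mycarea = 0 in 1
replace mycarea = mycarea[_n-1] + (omspec[_n-1] - omspec) * (sens + sens[_n-1]) / 2 in 2/l

* Compare with the area reported by rocss
list cutoff carea mycarea
```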

. rocss low p, ncut(80) gr saved(allsens) rep   

     +------------------------------------------------------+
     | cutoff     sens     spec   omspec    cclass    carea |
     |------------------------------------------------------|
  1. |  0.000   1.0000   0.0000   1.0000   31.2169   0.0000 |
  2. |  0.013   1.0000   0.0000   1.0000   31.2169   0.0000 |
  3. |  0.025   1.0000   0.0000   1.0000   31.2169   0.0000 |
  4. |  0.038   1.0000   0.0000   1.0000   31.2169   0.0000 |
  5. |  0.050   1.0000   0.0077   0.9923   31.7460   0.0077 |
  6. |  0.063   1.0000   0.0154   0.9846   32.2751   0.0154 |
  7. |  0.075   1.0000   0.0385   0.9615   33.8624   0.0385 |
  8. |  0.087   1.0000   0.0538   0.9462   34.9206   0.0538 |
  9. |  0.100   1.0000   0.0769   0.9231   36.5079   0.0769 |
 10. |  0.112   1.0000   0.1000   0.9000   38.0952   0.1000 |
 11. |  0.125   1.0000   0.1308   0.8692   40.2116   0.1308 |
 12. |  0.138   0.9661   0.1615   0.8385   41.2698   0.1610 |
 13. |  0.150   0.9661   0.1846   0.8154   42.8571   0.1833 |
 14. |  0.162   0.9492   0.2385   0.7615   46.0317   0.2349 |
 15. |  0.175   0.9153   0.2615   0.7385   46.5608   0.2564 |
 16. |  0.188   0.8983   0.3000   0.7000   48.6772   0.2913 |
 17. |  0.200   0.8983   0.3692   0.6308   53.4392   0.3535 |
 18. |  0.213   0.8814   0.4000   0.6000   55.0265   0.3808 |
 19. |  0.225   0.8644   0.4692   0.5308   59.2593   0.4413 |
 20. |  0.237   0.8644   0.5077   0.4923   61.9048   0.4745 |
 21. |  0.250   0.8475   0.5385   0.4615   63.4921   0.5008 |
 22. |  0.262   0.7966   0.5769   0.4231   64.5503   0.5325 |
 23. |  0.275   0.7458   0.6077   0.3923   65.0794   0.5562 |
 24. |  0.287   0.7119   0.6538   0.3462   67.1958   0.5898 |
 25. |  0.300   0.6441   0.6769   0.3231   66.6667   0.6055 |
 26. |  0.313   0.6271   0.7000   0.3000   67.7249   0.6201 |
 27. |  0.325   0.6102   0.7231   0.2769   68.7831   0.6344 |
 28. |  0.338   0.5932   0.7615   0.2385   70.8995   0.6576 |
 29. |  0.350   0.5424   0.8000   0.2000   71.9577   0.6794 |
 30. |  0.363   0.4915   0.8154   0.1846   71.4286   0.6874 |
 31. |  0.375   0.4746   0.8385   0.1615   72.4868   0.6985 |
 32. |  0.387   0.4576   0.8462   0.1538   72.4868   0.7021 |
 33. |  0.400   0.4407   0.8615   0.1385   73.0159   0.7090 |
 34. |  0.412   0.4068   0.8692   0.1308   72.4868   0.7123 |
 35. |  0.425   0.3729   0.8692   0.1308   71.4286   0.7123 |
 36. |  0.438   0.3729   0.8846   0.1154   72.4868   0.7180 |
 37. |  0.450   0.3220   0.8923   0.1077   71.4286   0.7207 |
 38. |  0.463   0.2881   0.8923   0.1077   70.3704   0.7207 |
 39. |  0.475   0.2881   0.8923   0.1077   70.3704   0.7207 |
 40. |  0.488   0.2881   0.9000   0.1000   70.8995   0.7229 |
 41. |  0.500   0.2712   0.9000   0.1000   70.3704   0.7229 |
 42. |  0.512   0.2712   0.9000   0.1000   70.3704   0.7229 |
 43. |  0.525   0.2712   0.9000   0.1000   70.3704   0.7229 |
 44. |  0.538   0.2712   0.9077   0.0923   70.8995   0.7250 |
 45. |  0.550   0.2542   0.9077   0.0923   70.3704   0.7250 |
 46. |  0.563   0.2373   0.9308   0.0692   71.4286   0.7306 |
 47. |  0.575   0.2203   0.9308   0.0692   70.8995   0.7306 |
 48. |  0.587   0.2034   0.9308   0.0692   70.3704   0.7306 |
 49. |  0.600   0.1864   0.9462   0.0538   70.8995   0.7336 |
 50. |  0.613   0.1864   0.9538   0.0462   71.4286   0.7351 |
 51. |  0.625   0.1356   0.9538   0.0462   69.8413   0.7351 |
 52. |  0.637   0.1017   0.9538   0.0462   68.7831   0.7351 |
 53. |  0.650   0.1017   0.9692   0.0308   69.8413   0.7366 |
 54. |  0.663   0.1017   0.9846   0.0154   70.8995   0.7382 |
 55. |  0.675   0.0508   0.9846   0.0154   69.3122   0.7382 |
 56. |  0.688   0.0508   0.9846   0.0154   69.3122   0.7382 |
 57. |  0.700   0.0508   0.9846   0.0154   69.3122   0.7382 |
 58. |  0.712   0.0508   0.9846   0.0154   69.3122   0.7382 |
 59. |  0.725   0.0508   0.9923   0.0077   69.8413   0.7386 |
 60. |  0.738   0.0508   0.9923   0.0077   69.8413   0.7386 |
 61. |  0.750   0.0339   0.9923   0.0077   69.3122   0.7386 |
 62. |  0.762   0.0339   0.9923   0.0077   69.3122   0.7386 |
 63. |  0.775   0.0339   0.9923   0.0077   69.3122   0.7386 |
 64. |  0.788   0.0169   0.9923   0.0077   68.7831   0.7386 |
 65. |  0.800   0.0169   0.9923   0.0077   68.7831   0.7386 |
 66. |  0.813   0.0169   0.9923   0.0077   68.7831   0.7386 |
 67. |  0.825   0.0169   0.9923   0.0077   68.7831   0.7386 |
 68. |  0.837   0.0169   0.9923   0.0077   68.7831   0.7386 |
 69. |  0.850   0.0169   0.9923   0.0077   68.7831   0.7386 |
 70. |  0.863   0.0000   0.9923   0.0077   68.2540   0.7386 |
 71. |  0.875   0.0000   0.9923   0.0077   68.2540   0.7386 |
 72. |  0.887   0.0000   1.0000   0.0000   68.7831   0.7386 |
 73. |  0.900   0.0000   1.0000   0.0000   68.7831   0.7386 |
 74. |  0.913   0.0000   1.0000   0.0000   68.7831   0.7386 |
 75. |  0.925   0.0000   1.0000   0.0000   68.7831   0.7386 |
 76. |  0.938   0.0000   1.0000   0.0000   68.7831   0.7386 |
 77. |  0.950   0.0000   1.0000   0.0000   68.7831   0.7386 |
 78. |  0.962   0.0000   1.0000   0.0000   68.7831   0.7386 |
 79. |  0.975   0.0000   1.0000   0.0000   68.7831   0.7386 |
 80. |  0.988   0.0000   1.0000   0.0000   68.7831   0.7386 |
 81. |  1.000   0.0000   1.0000   0.0000   68.7831   0.7386 |
     +------------------------------------------------------+

Number of observations                       = 189
Number of probability cutoffs (80+1)         = 81
Area under ROC curve                         = 0.7386
Highest value of correctly classified        = 73.0159
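
The summary reports the highest percentage correctly classified but not the cutoff at which it occurs; in the listing above, 73.0159 is reached at cutoff 0.400. A minimal sketch for finding that row in the saved dataset, assuming allsens.dta stores the variables cutoff, sens, spec and cclass as named in the listing header:

```stata
* Load the dataset written by savedata(allsens)
use allsens, clear

* Find the maximum of cclass and list the cutoff(s) that attain it
quietly summarize cclass
list cutoff sens spec cclass if cclass == r(max)
```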

[Figure: ROC curve #2, sensitivity versus 1-specificity at the 81 cutoffs from ncut(80)]
Draw your own graph using the new dataset.
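
One possible way to draw such a graph is sketched below. It assumes that allsens.dta contains the variables sens and omspec shown in the listing header; the titles and the dashed reference line are illustrative choices, not output of rocss.

```stata
* Load the dataset written by savedata(allsens)
use allsens, clear

* ROC curve: sensitivity against 1-specificity, with a 45-degree reference line
twoway (line sens omspec, sort)                              ///
       (function y = x, range(0 1) lpattern(dash)),          ///
       xtitle("1 - Specificity") ytitle("Sensitivity")       ///
       legend(off)
```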
