ASA 6 Answers - Event History Analysis
Summary
This document is an assignment on event history analysis, likely for an undergraduate-level social science course. It includes instructions and exercises, focusing on using STATA software to analyze data.
Full Transcript
Assignment 6: Event history analysis

The aim of this exercise is to find out how to perform a Cox regression model with time-constant explanatory variables with the use of the statistical software package STATA. Create a *.do file with STATA syntax to work time-efficiently. STATA output tables and figures should be neatly presented in your Word document.

If you experience any difficulties with the assignment or Stata:
1) First, Google it (or use Stata help);
2) Then, discuss it with your fellow students;
3) Last, ask or e-mail the computer lab supervisors (preferably during the computer lab sessions).

For all exercises:
● Work together in pairs of two and indicate both students' names on the worksheet.
● Always paste your syntax and the appropriate output from STATA in your answer file to support your answers.
● Whenever a new command is used, explain the structure and content of the code. At all times, make sure your work is transparent and traceable! Remember to save your work.
● Upload the answers of your assignment to Brightspace when you are finished. The deadline is on Monday 09:00 a.m.

Student names:
Student IDs:

Good luck with assignment 6!

Exercise A

Use the Stata system file eventhisfileCox.dta. This file contains information about respondents from two retrospective surveys held in the Netherlands in the early 1990s: the SSCW-survey (also called Telepanel survey) and the Netherlands Family Survey 1993. In these surveys extensive information was gathered about the respondents' life histories. For this assignment we use information about the timing of leaving the parental home and a small number of background variables.

Of course, you are expected to include tables and plots in your report to support your arguments. Lots of success and fun doing the assignment!

1. Make a new variable indicating the age of leaving home or censoring, using information about the year of birth and the year of leaving home. Name this variable "ageleft". Declare the dataset to be survival data using the stset command: the time variable is "ageleft" and the failure variable is "left".
Explore the failure distribution and use the Kaplan-Meier estimate of the age of leaving home to answer the following questions. Use the command "stdescribe" for summary statistics, and sts graph for plots of the survival and smoothed hazard:
a. Who tends to leave home earlier/faster: women or men?
b. What is the median age of leaving home for women and men?
c. Around which age is the hazard rate of leaving home highest, for women and men?

    gen ageleft = yearleft - birthyr
    label variable ageleft "Age when leaving parental home"
    stset ageleft, failure(left)
    sort sex
    by sex: stdescribe

    -> sex = Male
             failure _d:  left
       analysis time _t:  ageleft

                                |------------ per subject ------------|
    Category              total       mean       min    median      max
    no. of subjects        1630
    no. of records         1630          1         1         1        1
    (first) entry time                   0         0         0        0
    (final) exit time             24.56933        16        23       67
    subjects with gap         0
    time on gap if gap        0
    time at risk          40048   24.56933        16        23       67
    failures               1470   .9018405         0         1        1

    -> sex = Female
             failure _d:  left
       analysis time _t:  ageleft

                                |------------ per subject ------------|
    Category              total       mean       min    median      max
    no. of subjects        1551
    no. of records         1551          1         1         1        1
    (first) entry time                   0         0         0        0
    (final) exit time             22.71244        16        21       73
    subjects with gap         0
    time on gap if gap        0
    time at risk          35227   22.71244        16        21       73
    failures               1462   .9426177         0         1        1

    sts graph, by(sex)
    sts graph, hazard by(sex)

a) Women tend to leave home earlier/faster.
b) The median age of leaving home is 21 years for women and 23 years for men (see the table describing the survival time dataset and the Kaplan-Meier estimate).
c) The hazard rate of leaving home is highest in the early/mid-20s (see hazard function).

2. How about the proportionality assumption: is it justified, or is it violated, and how?
The proportionality assumption does not seem justified. Before age 32/33, women are more likely to leave home. Afterwards, men are more likely to leave. Above age 50, hazards go up. This does not make much sense; it is better to truncate the analysis much earlier.

3. Run a Cox regression of leaving home with sex as the only independent variable. What is the estimated difference in the hazard function between women and men? Consider whether it is necessary to account for clustering.

    stcox i.sex, vce(cluster nohhold)

             failure _d:  left
       analysis time _t:  ageleft

    Iteration 0:   log pseudolikelihood = -21169.894
    Iteration 1:   log pseudolikelihood = -21114.598
    Iteration 2:   log pseudolikelihood = -21114.561
    Refining estimates:
    Iteration 0:   log pseudolikelihood = -21114.561

    Cox regression -- Breslow method for ties

    No. of subjects      =      3,181          Number of obs   =      3,181
    No. of failures      =      2,932
    Time at risk         =      75275
                                               Wald chi2(1)    =     142.11
    Log pseudolikelihood = -21114.561          Prob > chi2     =     0.0000

                    (Std. Err. adjusted for 1,944 clusters in nohhold)
                 |              Robust
              _t | Haz. Ratio  Std. Err.      z    P>|z|   [95% Conf. Interval]
             sex |
         Female  |  1.481069   .0487977   11.92   0.000    1.38845   1.579867

The hazard of leaving home for women is estimated to be 48% higher than for men. Or, in other words, the hazard of women leaving the parental home is estimated to be 1.48 times as high as the hazard of men. The estimate is influenced most by the sex difference at younger ages.

4. To analyse how leaving the parental home has changed between birth cohorts:
a. Run another Cox regression, but now include birth cohort as an additional independent variable besides sex. Explain how leaving the parental home has changed between birth cohorts.
b. Plot the predicted cohort survival curves for females using the stcurve command. Compare them to the empirical survival curves using sts graph, and interpret the difference.
    stcox i.sex i.cohno, vce(cluster nohhold) nohr basesurv(surv0)

             failure _d:  left
       analysis time _t:  ageleft

    Iteration 0:   log pseudolikelihood = -21169.894
    Iteration 1:   log pseudolikelihood = -21007.298
    Iteration 2:   log pseudolikelihood = -21006.157
    Iteration 3:   log pseudolikelihood = -21006.156
    Refining estimates:
    Iteration 0:   log pseudolikelihood = -21006.156

    Cox regression -- Breslow method for ties

    No. of subjects      =      3,181          Number of obs   =      3,181
    No. of failures      =      2,932
    Time at risk         =      75275
                                               Wald chi2(6)    =     338.26
    Log pseudolikelihood = -21006.156          Prob > chi2     =     0.0000

                    (Std. Err. adjusted for 1,944 clusters in nohhold)
                 |              Robust
              _t |     Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
             sex |
         Female  |   .422586   .0344428   12.27   0.000   .3550794   .4900926
           cohno |
        1925-34  |  .0913412   .0996504    0.92   0.359  -.1039699   .2866523
        1935-44  |  .3625264   .1015541    3.57   0.000   .1634841   .5615688
        1945-54  |  .6847063   .1015396    6.74   0.000   .4856924   .8837202
        1955-64  |  .8262782   .0989664    8.35   0.000   .6323075   1.020249
        1965-74  |  .7517968   .1121559    6.70   0.000   .5319752   .9716183

The risk of leaving the parental home is greater for the younger birth cohorts. For example, compared to the oldest birth cohort 1900-24 (the reference category), the log hazard of leaving home is significantly higher for the birth cohorts 1955-64 and 1965-74, by 0.826 and 0.752 respectively. Only the 1925-34 cohort does not differ significantly from the reference cohort (p = 0.359).
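Because the model was fitted with nohr, the table reports coefficients on the log-hazard scale. Exponentiating a coefficient gives the corresponding hazard ratio, which is often easier to read. The lines below are a minimal sketch for a do-file; they use only coefficient values copied from the output above and do not need the dataset.

```stata
* Hazard ratio = exp(coefficient); the values are copied from the table above.
* Reference groups: cohort 1900-24 and men.
display "1955-64 vs 1900-24: " %5.2f exp(.8262782)
display "1965-74 vs 1900-24: " %5.2f exp(.7517968)
display "women vs men:       " %5.2f exp(.422586)
```

The first two lines give roughly 2.28 and 2.12, so the hazard of leaving home in the two youngest cohorts is estimated to be a little more than twice that of the oldest cohort.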
Now we plot predicted survival in females of different cohorts.

    stcurve, survival at1(cohno=1 sex=1) at2(cohno=2 sex=1) at3(cohno=3 sex=1) at4(cohno=4 sex=1) at5(cohno=5 sex=1) at6(cohno=6 sex=1)

Note: stcurve plots the function at the mean of unspecified covariates. So if our Cox model had included more variables than just cohno and sex, stcurve would have plotted the survival function at the mean values of these other variables. This is the advantage of using stcurve over manual calculation of the survival or hazard functions when you have more covariates than you plot separate lines for.

[Figure: Cox proportional hazards regression. Predicted survival (0 to 1) against analysis time (20 to 70), one curve per cohort (cohno=1 to cohno=6), all for sex=1.]

    sts graph if sex==1, by(cohno)

The difference between the two graphs is that the predicted survival is forced into proportionality. This is not the case for the empirical survival.

5. The dataset also contains information about finally achieved level of education: a time-constant variable measured at the moment of interview. Now take a look at the variables 'year finished education' and 'year of interview'. Why is it, strictly speaking, not appropriate to include final level of education in the Cox regression?

Education is time-varying, and the educational level at the time of leaving home might be lower than the educational level at the time of interview. It can be assumed that the level of education at the time of potentially leaving the parental home is crucial in the decision whether to leave or not.

The age at which children leave the parental home is usually when education is finished, but this is often not the case for university educated persons.
Some may leave the parental home in order to enrol in education somewhere else; others stay at home during the (professional/university) education period and only leave home when taking up a job or when starting to live with a partner. A look at the variable 'year finished education' shows that 5% of the respondents were still in education at the time of the interview. Using the information on 'final level of education' may violate the assumption of causality in the analysis. This problem is sometimes also referred to as "anticipatory analysis". Blossfeld and Rohwer (2002) devote chapter 1.2, "Event History Analysis and Causal Modeling", to this issue, where the use of time-dependent and time-independent variables is mentioned.

6. Include the variable anyway. What do you find? Variable edlong has values outside the range of 1-4. Drop these cases. How would you interpret the findings, given your answer to Question 5?

    drop if edlong<=0 | edlong>4
    stcox i.sex i.cohno i.edlong, vce(cluster nohhold) nohr

(The output below reports coefficients, so the nohr option is needed; without it Stata reports hazard ratios.)

             failure _d:  left
       analysis time _t:  ageleft

    Iteration 0:   log pseudolikelihood = -20857.716
    Iteration 1:   log pseudolikelihood = -20676.865
    Iteration 2:   log pseudolikelihood = -20675.408
    Iteration 3:   log pseudolikelihood = -20675.408
    Refining estimates:
    Iteration 0:   log pseudolikelihood = -20675.408

    Cox regression -- Breslow method for ties

    No. of subjects      =      3,139          Number of obs   =      3,139
    No. of failures      =      2,894
    Time at risk         =      74101
                                               Wald chi2(9)    =     365.83
    Log pseudolikelihood = -20675.408          Prob > chi2     =     0.0000

                    (Std. Err. adjusted for 1,925 clusters in nohhold)
                    |              Robust
                 _t |     Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
                sex |
            Female  |  .4326109   .0351521   12.31   0.000   .3637141   .5015077
              cohno |
           1925-34  |   .094824    .104068    0.91   0.362  -.1091456   .2987935
           1935-44  |  .3620078   .1061291    3.41   0.001   .1539985   .5700171
           1945-54  |  .6666931   .1062423    6.28   0.000   .4584621   .8749242
           1955-64  |  .7947564   .1043453    7.62   0.000   .5902435   .9992694
           1965-74  |  .7218987   .1176751    6.13   0.000   .4912598   .9525376
             edlong |
     lower secun..) | -.0861302   .0532381   -1.62   0.106  -.1904749   .0182145
     upper sec/l..) |  .0649361   .0592547    1.10   0.273  -.0512009   .1810731
     higher secu..) |  .2802169   .0636552    4.40   0.000    .155455   .4049788

The effect of this variable has to be interpreted with care. A possible interpretation is that people with a higher final level of education tend to have a greater risk of leaving the parental home, perhaps because they want to achieve that final level. It is however not possible to say that higher education is the cause of the higher risk of leaving home. Having only time-constant information about the educational level, it is not known which event precedes the other. In other words, we do not know whether higher education precedes leaving the parental home or whether leaving the parental home precedes higher education.

OPTIONAL Exercise B (but highly recommended to try!)

Introduction

Topic: Job mobility
Research question: How is the likelihood of quitting jobs related to individual and job characteristics?
Dataset: tda.dta

The variables in this dataset are:
    id      id of individual
    noj     serial number of job
    tstart  starting time of job
    tfin    ending time of job
    sex     1 = men, 2 = women
    ti      date of interview
    tb      date of birth
    TE      date of entry into labour market
    tm      date of marriage
    pres    prestige of job i
    presn   prestige of job i+1
    edu     years of formal education
    tfp     duration of job episode
    des     censoring (0 = censored, 1 = not censored)

Report

Download the Stata data file tda.dta from Nestor to a new working directory and open the data file. The tda data file will be used to determine the hazard rates of changing jobs (mobility) based on a set of variables.

1. Explore the following variables: tfp, des, sex using frequencies, means etc. Get a feel for the organisation and quality of the data. Recode the variable sex into 0 = men and 1 = women.

    sum tfp des sex
    tab1 tfp des sex
    inspect

Note: with tab1 you will get a one-way table for each variable listed. The regular command tab (= tabulate) will not work for more than two variables.

    Variable |   Obs       Mean    Std. Dev.    Min    Max
         tfp |   600      67.97    78.36633       2    428
         des |   600   .7633333    .4253906       0      1
         sex |   600       1.42    .4939703       1      2

    duration of |
    job episode |   Freq.    Percent      Cum.
              2 |       3       0.50      0.50
              3 |       7       1.17      1.67
              4 |       9       1.50      3.17
              5 |       4       0.67      3.83
    (and more…)

       censoring |   Freq.    Percent      Cum.
        censored |     142      23.67     23.67
    not censored |     458      76.33    100.00
           Total |     600     100.00

    sex (1 men, |
       2 women) |   Freq.    Percent      Cum.
            men |     348      58.00     58.00
          women |     252      42.00    100.00
          Total |     600     100.00

    recode sex (1=0 "men") (2=1 "women"), pre(new)

Note: this way, you form a new variable with the same name as the old one, but with prefix "new". So now you have two variables: sex and newsex. Instead of pre(new), you could use the option gen(newsex), which would give you the same thing.

    tab sex newsex

    sex (1 men, |  RECODE of sex (sex (1 men, 2 women))
       2 women) |     men     women     Total
            men |     348         0       348
          women |       0       252       252
          Total |     348       252       600

The cross table above shows that the newly recoded variable is correct.

2. Set the database as being survival data, where the time variable is tfp and the failure variable is des (use command stset).

    stset tfp, failure(des)

Note: as default, Stata will assume that des==0 or missing means censored and all other values (in our case, value 1) are interpreted as representing failure. This can also be read in the help file of stset. We call it failure when the event of interest occurs. In our dataset, this default setting is the correct one, because des==0 is censored and des==1 is not censored.

3. Perform a Cox regression with the covariate Sex. Compute both the hazard ratio and the coefficients. Make sure to cluster the standard errors across the id. What is the parameter value of sex? Is it statistically significant? What is the interpretation of this parameter?

    stcox i.newsex, vce(cluster id)

             failure _d:  des
       analysis time _t:  tfp

    Iteration 0:   log pseudolikelihood = -2584.5701
    Iteration 1:   log pseudolikelihood =  -2574.763
    Iteration 2:   log pseudolikelihood = -2574.7456
    Refining estimates:
    Iteration 0:   log pseudolikelihood = -2574.7456

    Cox regression -- Breslow method for ties

    No. of subjects      =        600          Number of obs   =        600
    No. of failures      =        458
    Time at risk         =      40782
                                               Wald chi2(1)    =      18.26
    Log pseudolikelihood = -2574.7456          Prob > chi2     =     0.0000

                         (Std. Err. adjusted for 201 clusters in id)
                 |              Robust
              _t | Haz. Ratio  Std. Err.      z    P>|z|   [95% Conf. Interval]
          newsex |
          women  |   1.52715   .1513132    4.27   0.000   1.257601   1.854473

The hazard ratio for sex is 1.52715, and the effect is statistically significant: the ratio differs significantly from 1 (equivalently, the coefficient differs from zero; z = 4.27, p < 0.001). In this case, men are the reference category, so it is the ratio hazard of females / hazard of males. So if we multiply the hazard of men by 1.52715, we obtain the hazard of women. The model indicates that women have a 52.72% higher hazard of changing jobs than men. This hazard ratio is exp(beta) and is the multiplicative effect of the independent variable on the hazard.

    stcox i.newsex, vce(cluster id) nohr

Note: by specifying nohr, we tell Stata to report coefficients, not hazard ratios.

             failure _d:  des
       analysis time _t:  tfp

    Iteration 0:   log pseudolikelihood = -2584.5701
    Iteration 1:   log pseudolikelihood =  -2574.763
    Iteration 2:   log pseudolikelihood = -2574.7456
    Refining estimates:
    Iteration 0:   log pseudolikelihood = -2574.7456

    Cox regression -- Breslow method for ties

    No. of subjects      =        600          Number of obs   =        600
    No. of failures      =        458
    Time at risk         =      40782
                                               Wald chi2(1)    =      18.26
    Log pseudolikelihood = -2574.7456          Prob > chi2     =     0.0000

                         (Std. Err. adjusted for 201 clusters in id)
                 |              Robust
              _t |     Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
          newsex |
          women  |  .4234031   .0990821    4.27   0.000   .2292059   .6176004

The coefficient is the log of the hazard ratio, ln(1.52715) = 0.4234, and shows that the log hazard of changing jobs is 0.4234 higher for women than it is for men.

4. Plot the predicted survival curves for males and females by
a. predicting survival variables for males and females, and
b. plotting both predictions as a line graph (command: line).
How do you interpret the two survival curves of males and females?
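Step (a) rests on a property of the proportional hazards model: the survivor function of a group with coefficient b equals the baseline survivor function raised to the power exp(b). The lines below are a minimal sketch of that arithmetic for a do-file. The coefficient is the one for women from the nohr output above; the baseline value 0.35 and the scalar names are purely illustrative assumptions, not values taken from the output.

```stata
* S_women(t) = S_men(t)^exp(b), with men as the baseline (reference) group.
* b_women: coefficient for women copied from the nohr output above.
scalar b_women = .4234031
* s0_demo: HYPOTHETICAL baseline (male) survival at some duration.
scalar s0_demo = .35
display "hazard ratio:             " %6.4f exp(scalar(b_women))
display "implied survival (women): " %6.4f scalar(s0_demo)^exp(scalar(b_women))
```

With these inputs the hazard ratio is about 1.527 and the implied survival for women is about 0.20. Because exp(b) is greater than 1 and survival probabilities lie between 0 and 1, the curve for women lies below the baseline curve at every duration.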
    help line
    stcox i.sex, vce(cluster id) nohr basesurv(surv0)
    generate surv1 = surv0^exp(0.4234031)
    label variable surv0 "men"
    label variable surv1 "women"
    line surv0 surv1 _t, c(J J) sort

Note: basesurv calculates and saves the baseline survivor function, so at the reference categories, in this case men.
Note: _t is the analysis time when the record ends. Option connect (c) in command line tells Stata how to connect points. There are different options, with J meaning stairstep: flat, then vertical. Option sort tells Stata to sort the data by the X variable before connecting.

The female curve is lower than the male curve: females 'survive' for a shorter time in a job, meaning they switch faster between jobs (higher mobility). For instance, after 100 months about 35 percent of males are estimated to be still in the job, against only about 20 percent of females (estimated from the curves).

5. Generate the variable "cohort" from "tb" by creating
- value 2 (cohort 1939-1941) for tb>=468 & tb<=504,
- value 3 (cohort 1949-1951) for tb>=588 & tb<=624,
- value 1 for all other values (other cohorts).
Furthermore, generate "lfx" as the difference between the starting time of the job and the date of entry to the labor market.
Generate "pnoj" as the serial number of job minus 1.
Using descriptive statistics, explore these new variables as well as edu and pres. Which variables are categorical and which variables are continuous?

    gen lfx = tstart - TE
    gen pnoj = noj - 1
    recode tb (468/504=2 "1939-41") (588/624=3 "1949-51") (nonmissing=1 "other cohorts"), gen(cohort)
    sum cohort lfx pnoj edu pres
    tabulate cohort

Cohort is a categorical variable. The other variables are treated as continuous.

6. Run another Cox regression. Remove the variable "sex" from the list of covariates and include the variables "edu, cohort, lfx, pnoj and pres". Specify which variables are categorical and choose a reference category that makes most sense in terms of ease of interpretation. Estimate the model with coefficients. What is the interpretation of the parameters?
Does the log likelihood improve significantly by the inclusion of the variables?

             failure _d:  des
       analysis time _t:  tfp

    Iteration 0:   log pseudolikelihood = -2584.5701
    Iteration 1:   log pseudolikelihood = -2548.0198
    Iteration 2:   log pseudolikelihood = -2546.7823
    Iteration 3:   log pseudolikelihood = -2546.7756
    Iteration 4:   log pseudolikelihood = -2546.7756
    Refining estimates:
    Iteration 0:   log pseudolikelihood = -2546.7756

    Cox regression -- Breslow method for ties

    No. of subjects      =        600          Number of obs   =        600
    No. of failures      =        458
    Time at risk         =      40782
                                               Wald chi2(6)    =      53.64
    Log pseudolikelihood = -2546.7756          Prob > chi2     =     0.0000

                         (Std. Err. adjusted for 201 clusters in id)
                 |              Robust
              _t |     Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
             edu |  .0668593   .0270978    2.47   0.014   .0137486     .11997
          cohort |
        1939-41  |  .4113074   .1211748    3.39   0.001   .1738091   .6488057
        1949-51  |  .3052514    .122002    2.50   0.012   .0661318    .544371
             lfx |  -.003989   .0010417   -3.83   0.000  -.0060307  -.0019474
            pnoj |  .0686267   .0389654    1.76   0.078   -.007744   .1449975
            pres | -.0261678   .0060146   -4.35   0.000  -.0379561  -.0143795

The reference category for COHORT is the first one, 'other cohorts', as we want to estimate the effects specifically for those people born in 1939-41 and 1949-51.

The Wald chi2 value of 53.64 is highly significant (p = 0.0000), indicating that the inclusion of the variables significantly improves the model (improves the log pseudolikelihood).

Note: we now obtain a Wald chi2 instead of LR, because the standard errors are clustered. Most important is that you know that the null hypothesis of the Wald test is that the coefficients of interest are simultaneously equal to zero, so you interpret the "Prob > chi2" the same way as with LR. We reject this H0 because prob = 0.0000.

a. One additional school year increases the log hazard of leaving the job by 0.067: the higher educated are more mobile.
b. The younger cohorts are more mobile, although this relationship appears to be nonlinear as the middle cohort is more mobile than the youngest group. The log hazard is 0.41 higher for cohort 1939-41 compared to other cohorts, and 0.31 higher for cohort 1949-51 compared to other cohorts.
c. The log hazard of job mobility decreases by 0.004 for every additional unit of labor force experience (lfx is measured in the same time units as tstart and TE, which appear to be months).
d. PNOJ is only significant at the 10% significance level (p = 0.078). We would have expected that if someone tends to move between jobs (evidenced by a high serial number of the current job), that would have a positive effect on that person's exit rate. The positive coefficient does suggest such a relation, although a weak one.
e. People in more prestigious jobs are less mobile (as expected). For every unit increase in job prestige, the log hazard of job mobility decreases by about 0.03.

7. Now add the variable SEX to the model. Is the effect of sex still important and comparable to the earlier model? Do the other effects change (sign, size and significance) as a result of the inclusion of sex?

             failure _d:  des
       analysis time _t:  tfp

    Iteration 0:   log pseudolikelihood = -2584.5701
    Iteration 1:   log pseudolikelihood =  -2541.043
    Iteration 2:   log pseudolikelihood = -2539.6867
    Iteration 3:   log pseudolikelihood = -2539.6788
    Iteration 4:   log pseudolikelihood = -2539.6788
    Refining estimates:
    Iteration 0:   log pseudolikelihood = -2539.6788

    Cox regression -- Breslow method for ties

    No. of subjects      =        600          Number of obs   =        600
    No. of failures      =        458
    Time at risk         =      40782
                                               Wald chi2(7)    =      94.07
    Log pseudolikelihood = -2539.6788          Prob > chi2     =     0.0000

                         (Std. Err. adjusted for 201 clusters in id)
                 |              Robust
              _t |     Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
             edu |  .0763211   .0251799    3.03   0.002   .0269695   .1256728
          cohort |
        1939-41  |    .38596   .1128203    3.42   0.001   .1648364   .6070837
        1949-51  |  .2958657   .1218758    2.43   0.015   .0569934   .5347379
             lfx | -.0040866   .0010363   -3.94   0.000  -.0061177  -.0020555
            pnoj |  .0903056   .0408095    2.21   0.027   .0103205   .1702907
            pres | -.0249239   .0056655   -4.40   0.000  -.0360281  -.0138197
          newsex |
          women  |  .3698895   .0982921    3.76   0.000   .1772404   .5625385

The coefficient of sex (0.37) is comparable to the fitted coefficient in the first model (0.42), and statistically significant. The confidence intervals of the coefficient in the prior model and in this model overlap to a large extent, so there is no indication that the two estimates differ.
The effects of the other variables do not change sign and change just a tiny bit in size. pnoj changes most in size and becomes significant at the 0.05 level. Apparently the effect of this variable was masked somewhat by not including sex.

8. Make an LML plot (log-minus-log of survival function, command stphplot) based on each of the categorical variables, using the variable of interest as a stratum variable after excluding it from the model. This ensures you will get an empirical estimate of the difference between the categories rather than a model outcome. Do you have indications for one or more variables that the proportionality assumption does not hold?

    stcox edu i.cohort lfx pnoj pres, vce(cluster id)
    stphplot, strata(sex) adjust(cohort)

Note: option adjust adjusts to average values of other covariates.

    stcox edu lfx pnoj pres i.sex, vce(cluster id)
    stphplot, strata(cohort) adjust(sex)

The LML functions for sex are largely parallel, although less so for short durations. The LML functions for cohort are also largely parallel, but this looks best for middle durations. My own (Clara Mulder's) conclusion from these plots would be that the proportionality assumption is valid to a reasonable extent and the model outcomes can be trusted.
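Why parallel LML curves support proportionality can be seen from the Cox model itself. The short derivation below is a standard result and not part of the assignment text; S_0 denotes the survivor function of the reference category, S_1 that of the comparison category, and beta the coefficient of the stratum variable.

```latex
\begin{align*}
S_1(t) &= S_0(t)^{\exp(\beta)} \\
-\ln S_1(t) &= \exp(\beta)\,\bigl(-\ln S_0(t)\bigr) \\
\ln\bigl(-\ln S_1(t)\bigr) &= \ln\bigl(-\ln S_0(t)\bigr) + \beta
\end{align*}
```

Under proportional hazards the two log-minus-log curves therefore differ by the same constant beta at every duration, so they run parallel. Curves that converge, diverge or cross suggest that the effect of the variable changes over time.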