Nevertheless, magnitude guidelines have appeared in the literature. Perhaps the first were Landis and Koch,[13] who characterized values < 0 as indicating no agreement, 0–0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1 as almost perfect agreement. These guidelines are not universally accepted, however; Landis and Koch provided no supporting evidence, relying instead on personal opinion, and it has been argued that such guidelines may be more harmful than helpful.[14] Fleiss's[15]:218 equally arbitrary guidelines characterize kappas over 0.75 as excellent, 0.40 to 0.75 as fair to good, and below 0.40 as poor.

Cohen's kappa coefficient (κ) is a statistic used to measure inter-rater reliability (and also intra-rater reliability) for qualitative (categorical) items.[1] It is generally considered a more robust measure than a simple percent-agreement calculation, because it takes into account the possibility of agreement occurring by chance. There is controversy surrounding Cohen's kappa due to the difficulty of interpreting indices of agreement; some researchers have suggested that it is conceptually simpler to evaluate disagreement between items.[2] For more details, see Limitations.
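In formula terms, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement between the two raters and p_e is the agreement expected by chance from each rater's marginal label frequencies. The short sketch below only illustrates that arithmetic; the example labels and the cohens_kappa helper are hypothetical, not taken from the cited sources.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)

    # Observed agreement: fraction of items both raters labelled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement expected from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)

    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: two raters classifying 10 items as "pass"/"fail".
a = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass", "pass", "pass"]
print(round(cohens_kappa(a, b), 3))
```

With these labels the observed agreement is 0.80, the chance agreement is 0.62, and κ comes out to roughly 0.47, which the Landis and Koch scale above would call moderate agreement even though the raw agreement looks high.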

Once your company has a more agile contract-analysis process and the results show up in your KPIs, you may feel more confident about taking on new business obligations. Setting up new offers is much easier when you have your entire contract library at your fingertips during negotiation, so you can work more efficiently and take bolder steps in growing your business.

In this example a repeatability assessment is used to illustrate the idea, but it applies equally to reproducibility. The point is that many samples are needed to detect differences in an attribute agreement analysis, and even doubling the number of samples from 50 to 100 does not make the test much more sensitive. The difference that needs to be detected depends, of course, on the situation and on the level of risk the analyst is prepared to accept in the decision, but the reality is that with 50 scenarios an analyst is unlikely to find a statistically significant difference in reproducibility between two examiners with match rates of 96 percent and 86 percent. Even with 100 scenarios, the analyst will not be able to distinguish 96 percent from 88 percent. Repeatability and reproducibility are the precision components of an attribute measurement system analysis, and it is advisable to establish first whether a precision problem exists at all.
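One way to see why those differences are hard to detect is to put a confidence interval around each examiner's match rate; when the intervals overlap, the data cannot separate the two examiners. The sketch below is only an illustration: it assumes simple normal-approximation 95% intervals and hypothetical match counts chosen to mirror the scenarios above (48 of 50 versus 43 of 50, then 96 of 100 versus 88 of 100).

```python
import math

def agreement_ci(matches, n, z=1.96):
    """Normal-approximation 95% confidence interval for a match rate."""
    p = matches / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

# Hypothetical match counts mirroring the scenarios in the text.
for n, matches_a, matches_b in [(50, 48, 43), (100, 96, 88)]:
    ci_a = agreement_ci(matches_a, n)
    ci_b = agreement_ci(matches_b, n)
    overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
    print(f"n={n}: examiner A {ci_a[0]:.2f}-{ci_a[1]:.2f}, "
          f"examiner B {ci_b[0]:.2f}-{ci_b[1]:.2f}, overlap={overlap}")
```

In both cases the intervals overlap, which is why doubling the sample size from 50 to 100 buys so little additional sensitivity.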