Forecasting and Predictive Modelling: Innovative Statistical Analyses of Judicial Data
Arunav Kaul
Gaurav Banerjee
Introduction
A few months ago, several media outlets reported that a group of students from IIT-Madras had developed a software tool that can predict crowd behaviour and could be used by the Indian Army to deal with stone pelters in Jammu and Kashmir.1 Using crowd density maps and live images, the tool predicts abnormal events, such as stone pelting. The utility of such predictive models is indeed immense. In today’s world, predictive modelling and forecasting have become the norm. From banks to credit card companies, nearly every institution uses one form of prediction model or another to take better-informed decisions. Discovering patterns in large data sets and analysing trends are important for any institution to sustain itself and progress. In such a scenario, why should judicial data be left out of such analyses?
The Indian judiciary is a large institution which processes lakhs of cases across the country each year. With large volumes of case-related data being generated online on a daily basis, can it not be used to help predict case life, which, in turn, can help judges make better decisions? Unfortunately, judicial delays are on the rise. While justice, equality, and good conscience are three sacrosanct terms that one associates with the judiciary, there are also three other words with which our judiciary is being associated. These are pendency, delay, and backlog. It is no surprise that one affects the other and certainly in an adverse manner. Until the issues of chronic delay and mounting backlog are adequately addressed in the Indian judicial system, principles like justice and equality will continue to be affected.
The objective of this chapter is to showcase the manner in which judicial data can be used to forecast and deconstruct the working of the court. While analysing the working of the courts, the chapter attempts to study the different variables that play an important part in the life cycle of a case. With the help of the variables identified, this chapter focuses on:
- Time series model that forecasts the number of cases that will be filed in the coming years;
- Survival analysis that helps in providing the probability of cases pending at different time periods;
- Understanding the relationship of various stages with case life and highlighting their impact through correlations.
Current Scenario
Numerous reports on the problems of judicial delay by the Law Commission of India and various other committees from the 1950s to the current times are testimony to the fact that judicial delay is not a new topic of concern. Increasing pendency was a concern then and remains a concern even now, as the number of cases being filed and pending every year seems to be on the rise. To illustrate this, Figure 4.2.1 provides the total number of cases filed and pending in subordinate courts across the country. The data gathered pertains to the years between 2009 and 2017.
Figure 4.2.1. Number of Cases Filed (Above) and Pending (Below) in Subordinate Courts 2009–2017

As one may clearly note, the number of cases filed has been increasing over the years, with 2015–2016 and 2016–2017 witnessing a sudden escalation in the number of filings. In 2017 alone, nearly two crore cases were filed across the country. The trend for pending cases too seems to be similar, as the number is found to increase in 2016 and 2017. Numbers noted as of 22 January 2019 are no different: as per the National Judicial Data Grid (NJDG), 2.85 crore cases were pending in subordinate courts across the country.
Rapid Change Due to Technology
On 7 August 2013, a watershed event occurred for the Indian judiciary when the Chief Justice of India inaugurated the national e-courts website for subordinate courts. Perhaps, one of the most important steps towards better court and case management was the launch of the e-courts website. The website provides details of cases registered in courts in different parts of the country. Information related to cases listed in the court, stages, and even copies of daily order sheets are fed into the e-courts system by court staff in different parts of the country. Not only do these details keep the litigants and advocates better informed, the data entered can be ‘regularly analysed for meaningful assistance in policy formation and decision making’.2
This chapter utilises the variables provided on the e-courts website to carry out a variety of analyses. In order to analyse the data in depth, it was important to select a particular district from where the cases could be examined. Given the availability of a large sample size and better data quality, we chose the courts in Mysuru district.
Data Description
The data for Mysuru district was extracted from the e-courts website between 28 June 2018 and 1 July 2018. Out of a total of 21 court establishments3 in Mysuru, data from 19 court establishments was extracted for analysis in this chapter.4 Figure 4.2.2 below provides a snapshot of the sample size.
Figure 4.2.2. Sample Size

Before proceeding with an in-depth analysis of cases in Mysuru, it is important to provide an overall trend of how cases progress and highlight the courts’ performance. Accordingly, Figure 4.2.3 shows the average number of days to disposal and average number of days for which cases have been pending in Mysuru.
Overall, the average days to disposal of civil cases is higher than that of criminal cases in Mysuru. However, in civil cases, the average pendency is higher than average days to disposal. A similar trend is observed even for criminal cases. This may be due to the fact that cases tend to get disposed quickly, failing which they tend to remain pending for longer periods of time. This particular trend has also been observed in other courts, as noted in DAKSH’s previous reports.5 While Figure 4.2.3 focuses on the average days, Table 4.2.1 depicts the percentage of cases that have been getting accumulated over the years.
Figure 4.2.3. Average Disposal and Pendency (in Years) of Civil and Criminal Cases

Table 4.2.1. Percentage of Cases Pending as of 1 July 2018

The figures in Table 4.2.1 indicate that 16 per cent of cases filed in 2013 were still pending as of 1 July 2018 and 21 per cent of cases filed in 2014 were still pending as of 1 July 2018. It is, therefore, important to focus on long-pending cases and ensure that cases do not get lost in judicial limbo.
Identifying Cases Filed Through a Time Series Model
One often sees news channels and environmental agencies predicting rainfall or determining future air quality readings.6 While most of us would only pay attention to the forecast, if we were to take a step back and consider how one comes up with these kinds of projections, the answer would be a time series model. In simple terms, a time series is a collection of observations recorded sequentially in time.7 These observations are used as data points to study trends across different time periods. The use of time series models is not limited to studies regarding the environment. They are used in several other fields, such as in economics to assess unemployment trends and in finance to forecast exchange rates or predict share prices in the market.8 One of the advantages of a time series model is that it helps in forecasting trends and patterns that can help in taking informed steps.
Can a time series model be built using judicial data to project future institution of cases? A study carried out on the US Federal District Courts using cases filed from 1904 to 1998 studies the growth rate of cases filed across the years and projects the number of filings between 2000 and 2020.9 Although it would be extremely difficult to gather such historical data in the Indian context, recent data available in the public domain can be studied to provide a forecast for the coming years.
Projections of Case Filings in Mysuru
Before proceeding with the forecast, let us take a look at the historical trend of cases filed in Mysuru. Figure 4.2.4 depicts the total number of cases filed in each of the months in Mysuru district, starting from January 2015 until December 2018.10
Figure 4.2.4. Total Number of Cases Filed between 2015 and 2018

There is a gradual increase in the number of cases filed over the years in Mysuru, with July and December 2018 experiencing the highest number of filings. Interestingly, there is a spike in filings in June and July every year, while there is a dip in case filings in April and May every year. This pattern can be attributed to court vacations. Subordinate courts in Mysuru remain closed on account of vacations from the end of April to May. Once courts reopen, the next two months witness a marked increase in case filings. To get an in-depth view of the historical filings, Figure 4.2.5 further analyses the data from different perspectives.
Figure 4.2.5. Categories of Patterns of the Historical Data

Figure 4.2.5 highlights the various components of a time series model. The figure examines the historical data through three important components, namely, trend, seasonality, and randomness. This is known as decomposition of the time series data. The first graph (observed) is the raw data as shown previously in Figure 4.2.4. The second graph (trend) shows the long-term increase or decrease of values in the data set.11 One may note a gradual increase in the filings over the years in Mysuru. The third graph (seasonal) highlights the repeating cycles in the data set.12 For instance, case filings dipping in the months of April and May and spiking in June and July due to court vacations can be easily captured through the seasonality chart. Lastly, randomness, as the name suggests, shows the random variations in the data set; for instance, there would be certain months in which case filings would randomly increase or decrease with no particular trend.13
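The decomposition described above can be illustrated with a minimal Python sketch. It assumes a classical additive decomposition with a centred 12-month moving average, which may differ from the exact procedure behind Figure 4.2.5. The monthly counts are synthetic values invented for illustration (not the Mysuru data), and the function names are hypothetical.

```python
import random


def centred_moving_average(series, period=12):
    """Estimate the trend with a centred moving average (even period assumed)."""
    assert period % 2 == 0, "this sketch only handles even periods such as 12"
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # The two end points of the window each get half weight.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def decompose_additive(series, period=12):
    """Split a series into trend + seasonal + random components."""
    trend = centred_moving_average(series, period)
    # Average the detrended values for each position in the yearly cycle.
    buckets = [[] for _ in range(period)]
    for t, (x, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(x - tr)
    raw = [sum(b) / len(b) for b in buckets]
    mean_raw = sum(raw) / period
    cycle = [r - mean_raw for r in raw]  # centre so the effects sum to zero
    seasonal = [cycle[t % period] for t in range(len(series))]
    random_part = [
        None if tr is None else x - tr - s
        for x, tr, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, random_part


# Synthetic monthly filings (assumption): upward trend, a dip in April and May,
# a spike in June and July, plus random noise.
random.seed(0)
MONTH_EFFECT = [0, 0, 0, -800, -900, 900, 800, 0, 0, 0, 0, 0]
filings = [
    4000 + 40 * t + MONTH_EFFECT[t % 12] + random.gauss(0, 100)
    for t in range(48)
]

trend, seasonal, random_part = decompose_additive(filings)
print("April effect:", round(seasonal[3]), "June effect:", round(seasonal[5]))
```

The trend is undefined for the first and last six months because the centred window does not fit there, which is why those entries are left as `None`.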
Using the Holt-Winters filtering model (explanation with equation is in Annexure A),14 we forecast the number of cases that will be instituted in the coming years in Mysuru.
Figure 4.2.6. Forecast of Number of Case Filings in Mysuru for the Period between 2019–2022

Figure 4.2.6 projects the cases that will be filed in Mysuru from 2019–2022. The blue line indicates the projected filings (also known as the point forecast) over the next four years, while the outer grey lines are the higher and lower range of filings. Case filings in the time series model have been projected with a confidence interval of 80 per cent. This means that, under the model’s assumptions, actual filings are expected to fall within the forecasted range with a probability of 80 per cent. One may note that over the years, there is a gradual increase in case filings with certain months seeing as many as 8,000 cases filed. Table 4.2.2 shows the predicted filings along with the higher and lower ranges between which cases will be filed.

Table 4.2.2. Forecasted Case Filings

| Year | Forecasted case filings (point forecast) | Forecasted lower range (minimum) | Forecasted higher range (maximum) |
|------|------------------------------------------|----------------------------------|-----------------------------------|
| 2019 | 65,496 | 51,158 | 79,829 |
| 2020 | 75,664 | 53,977 | 97,353 |
| 2021 | 85,836 | 58,755 | 1,12,916 |
| 2022 | 96,004 | 64,448 | 1,27,559 |

Source: Authors’ calculations.

To test the accuracy of the model, we fed the number of cases filed between 2015–2017 and forecasted the number of cases that would be filed in 2018. These forecasted numbers were compared with the actual number of cases filed in different quarters in 2018. Table 4.2.3 shows the comparison.

Table 4.2.3. Testing the Predicted Values

Table 4.2.3 tests the accuracy of the time series model used in the chapter. All the cases filed in 2018 fall within the forecasted ranges (upper and lower) barring Quarter 2. In terms of forecasted number, all the different quarters are close to the point forecast, barring Quarter 2 where the difference is comparatively high. The forecasted numbers can help in giving the judge as well as the registry an assessment of the total cases that the court may expect to be filed. Requirements for adequate infrastructure, staff, and processes can accordingly be put in place in view of the incoming case load.
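The hold-out test described above can be sketched in Python as follows. This is a minimal illustration, not the authors’ implementation: it assumes the standard additive Holt-Winters recursions with hand-picked smoothing parameters (`alpha`, `beta`, `gamma`), whereas the chapter’s parameters are not stated here. The filings are synthetic, and the function name is hypothetical.

```python
import random


def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters point forecasts for `horizon` future steps."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons to initialise")
    first, second = series[:period], series[period : 2 * period]
    # Initial slope: average per-step change between the first two seasons.
    trend = (sum(second) - sum(first)) / period ** 2
    mean_first = sum(first) / period
    mid = (period - 1) / 2
    # Initial seasonal effects: first-season values with the slope removed.
    seasonal = [x - (mean_first + (i - mid) * trend) for i, x in enumerate(first)]
    level = mean_first + mid * trend  # level at the end of the first season
    for t in range(period, len(series)):
        x, s_old, prev_level = series[t], seasonal[t % period], level
        level = alpha * (x - s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (x - level) + (1 - gamma) * s_old
    n = len(series)
    return [
        level + h * trend + seasonal[(n + h - 1) % period]
        for h in range(1, horizon + 1)
    ]


# Synthetic monthly filings for four years (assumption, not the Mysuru data).
random.seed(1)
MONTH_EFFECT = [0, 0, 0, -800, -900, 900, 800, 0, 0, 0, 0, 0]
history = [
    4000 + 40 * t + MONTH_EFFECT[t % 12] + random.gauss(0, 100)
    for t in range(48)
]

# Fit on the first three years and forecast the fourth, as in the hold-out test.
train, holdout = history[:36], history[36:]
forecast = holt_winters_additive(
    train, period=12, alpha=0.3, beta=0.1, gamma=0.3, horizon=12
)

for q in range(4):
    predicted = sum(forecast[3 * q : 3 * q + 3])
    actual = sum(holdout[3 * q : 3 * q + 3])
    print(f"Q{q + 1}: forecast {predicted:,.0f}  actual {actual:,.0f}")
```

In practice the smoothing parameters would be chosen by minimising the in-sample forecast error rather than fixed by hand, and a forecast range would be reported alongside each point forecast.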
Studying Life cycle of Cases
Studying the life cycle of cases and understanding the manner in which they proceed through different stages is essential to identify bottlenecks in the system. To effectively study the life cycle of cases and their various attributes, cases have been divided into civil and criminal categories. Due to difference in procedures, stages, and subject matter, it was considered necessary to analyse them separately.
It must be noted that a typical criminal case emanating from a police station and being tried in the court would go through certain stages, starting from cognisance of the chargesheet to framing of charges, evidence, arguments, and then the final judgment. Figure 4.2.7 depicts the broad stages that constitute the progress of criminal cases.
Figure 4.2.7. Stages in Criminal Cases

While the stages provided in Figure 4.2.7 may be true for certain cases, there are some cases that do not pass through the evidence stage at all. This may be due to several reasons; for instance, if the accused pleads guilty to the charges or confesses to the crime, the evidence stage is dispensed with. Further, in certain case types, such as criminal appeals or miscellaneous criminal appeals, there is no evidence stage. As a general practice, once the case goes on appeal, the court focuses primarily on the question of law and does not revisit the evidence, as the entire evidence stage would have already been completed during proceedings in the trial court. Hence, barring certain exceptional cases, appeals do not go through the evidence stage.
The evidence stage forms an important part of the life cycle of a criminal case. Not only does the entire case rest on the outcome of the evidence stage, but the courts also spend a considerable amount of time in cross-examining witnesses. Summoning the investigating officer and witnesses, carrying out examination in chief and cross-examination, recording the statement of accused under Section 313 of the CrPC, and so on occupy a significant amount of time of the court.
Figure 4.2.8 broadly shows the different stages in a civil case.
Figure 4.2.8. Stages in Civil Cases

Apart from the stages provided in Figure 4.2.8, there are a few other stages, such as alternate dispute resolution (ADR), which can be sought by the parties in case they are interested in out-of-court settlements. Even in civil cases, not all cases go through the evidence stage. Cases relating to execution or appeals do not have an evidence stage. Also, where the suit is dismissed for want of cause of action or where parties settle the dispute through ADR, the case does not go through the evidence stage.
Hence, for the purpose of carrying out a comprehensive analysis, cases may be bifurcated into those that go through the evidence stage and those that do not.
Assessing the Life cycle of Cases through Survival Analysis
While forecasting the number of cases filed would help judges and the registry in taking strategic decisions, a model that can help in assessing the life of a case and providing estimations for disposal of cases would be helpful in gaining a deeper understanding of the court’s workload. To provide estimations regarding the disposal of cases, the chapter uses survival analysis to assess the life cycle of cases in Mysuru. A similar study aimed at deconstructing the duration of cases through survivor analysis was carried out for civil cases filed in certain courts in Argentina.15
Survival analysis is a set of statistical techniques used to study the data where ‘the outcome variable is the time until the occurrence of an event of interest’.16 Such an event can be death, marriage, or occurrence of a disease. The time it takes to the event, or in other words the survival time, can be determined in years, months, weeks, etc.17 For example, if the event of interest is a heart attack, then survival analysis can help in providing information about the time taken for the person to develop the heart attack.18 Hence, the subject is observed over a certain period of time until the event of interest occurs.
Survival analysis is widely used in biology or health-related areas where researchers face the issue of incomplete observations.19 For instance, imagine a study is being undertaken for 10 weeks to understand the survival time of individuals developing a heart attack.20 During the study period, let us say that some patients develop a heart attack, while certain other patients drop out of the study, and some do not develop a heart attack at all. To avoid discarding incomplete or partial data, survival analysis can be used to provide an estimate of when the event may occur.21 The completed observations, that is patients who developed a heart attack, are called ‘non-censored’, while observations that remained incomplete, that is patients who either dropped out or never developed the heart attack, are called ‘censored’.22
Similarly, cases filed in the court can also be bifurcated into non-censored, that is disposed cases which have been concluded in the court, and censored, that is cases that are still pending in the court and are yet to be concluded. Based on the data, the survivor model techniques can be used to determine case duration or survival time.
In this chapter, we use survival analysis (Kaplan-Meier Product Limit Method; see Annexure B for details) to estimate the probability of case disposals.
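To make the idea concrete, the following Python sketch computes a Kaplan-Meier (product-limit) estimate for a handful of made-up cases. Disposed cases are treated as events and pending cases as censored at the date of observation, as described above. The durations, the `cases` list, and the function names are illustrative assumptions and are not drawn from the Mysuru sample.

```python
from collections import Counter


def kaplan_meier(durations, disposed):
    """Return [(time, probability still pending)] at each disposal time.

    durations: days from filing to disposal, or to the observation date
               for cases that are still pending (censored).
    disposed:  True if the case was disposed, False if still pending.
    """
    events = Counter(t for t, d in zip(durations, disposed) if d)
    exits = Counter(durations)  # disposed or censored at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: share of at-risk cases not disposed at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def probability_pending(curve, t):
    """Estimated probability that a case is still pending after t days."""
    prob = 1.0
    for time, s in curve:
        if time > t:
            break
        prob = s
    return prob


# Hypothetical cases: (days observed, disposed?)
cases = [(200, True), (400, True), (400, False),
         (700, True), (900, False), (1000, True)]
durations = [days for days, _ in cases]
disposed = [flag for _, flag in cases]

curve = kaplan_meier(durations, disposed)
print("Pending after 1 year: ", round(probability_pending(curve, 365), 3))
print("Pending after 2 years:", round(probability_pending(curve, 730), 3))
```

Pending cases still contribute to the number of cases at risk up to the day they were last observed, which is how the method avoids discarding incomplete observations.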
Applying Survival Analysis to Criminal Cases (CC)
Criminal cases differ from each other in nature and complexity; hence, analysing all criminal cases together may not provide accurate results. In addition, carrying out separate analyses for each different type of case would result in numerous methods and insights, which would be difficult to understand and present. Hence, drilling down to a particular case type would be a good start to showcase the results. Accordingly, the case type Criminal Cases (CC) that forms the majority in the sample study has been considered for survivor analysis in this chapter.
Criminal Cases are those that originate from the police station where police officials file a chargesheet upon registration of a First Information Report. Criminal Cases pertain to offences in the Indian Penal Code, protection of women from domestic violence, mines and minerals, etc. Criminal Cases are primarily adjudicated by judicial magistrates. To carry out the survival analysis, cases were divided into those that undergo the evidence stage, and those that do not, as explained earlier.
In our study, amongst the cases that did not go through the evidence stage, the nature of disposal in 32 per cent of the cases was ‘closed’. In 17 per cent of cases, the nature of disposal was ‘uncontested conviction or conviction by pleading guilty’. Table 4.2.4 shows the probability of cases surviving or pending over different points in time after being filed.
Table 4.2.4. Probability Score of Cases Pending for Criminal Cases (CC)

Table 4.2.4 compares the probability score of cases that will remain pending at the end of each year over a 10-year timeframe for cases that go through evidence and the ones that do not. Out of 83,284 Criminal Cases in the sample, 48,124 cases did not have an evidence stage at all. As per Table 4.2.4, one may note that the probability of remaining pending at different time periods is higher for cases that go through the evidence stage, as opposed to cases that do not go through the evidence stage. Hence, if a case does not go through the evidence stage, then its probability of getting disposed quickly is higher, if all else is equal. Cases that do not go through the evidence stage have a 64 per cent probability of remaining pending for one year or more. That means the probability of such cases getting disposed within one year of filing is 36 per cent. However, for cases that do go through the evidence stage, the probability of them pending for one year or more is 83 per cent. Thus, there is a 17 per cent probability that such cases would be disposed within one year of filing. The trend is similar across different time periods, showing that the probability of a case getting disposed faster is high when it does not go through the evidence stage.
Applying Survival Analysis to Civil Cases
The nature of civil cases filed in Mysuru district differs widely. To carry out survival analysis, it was important to target certain case types. Hence, based on the number of cases in the sample study, survival analysis was carried out for Original Suit (OS), Execution Cases (EX), Motor Vehicle Cases (MVC), Small Cause Case (SC), Regular Appeals (RA), and Land Acquisition Cases (LAC). These case types form a considerable load of cases in Mysuru district. In our study, amongst the cases that do not go through the evidence stage, 38 per cent of them indicated nature of disposal as ‘closed’, followed by 24 per cent of cases which were ‘compromised’. Table 4.2.5 shows the probability of cases that remain pending for different case types across various points in time.
Table 4.2.5. Probability Score of Civil Cases that Go through the Evidence Stage

Table 4.2.6. Probability Score of Civil Cases that Do Not Go through the Evidence Stage

Tables 4.2.5 and 4.2.6 compare the probability scores of cases that go through evidence and the ones that do not. Cases categorised as EX and RA, which do not go through the evidence stage at all, have been included only in Table 4.2.6. Similar to CC, LAC have a higher probability score for cases that go through the evidence stage, indicating that the probability of such cases remaining pending is higher when compared to the cases that do not go through evidence, if all else is equal. Amongst LACs that do not go through the evidence stage, the probability of cases pending for one year or more is 39 per cent, as more than half of the cases would probably be disposed within one year of being filed.
A problematic case type is MVC, where the probability of cases remaining pending even after two or three years is higher for cases that do not go through the evidence, when compared to ones that do. The evidence stage occupies a considerable amount of time and effort of the court. If despite not going through the evidence stage, cases take longer to be disposed, then there are certain other stages that may be causing delay in a case. A similar pattern can also be seen for SC cases, where the probability of cases pending at different time intervals is higher for cases that do not go through the evidence stage.
The probability of EX cases pending for one year or more is 68 per cent, while the probability of EX cases pending for six years or more is 22 per cent. This is problematic too. EX cases relate to the execution of the decree passed by the court, which means litigants have to approach the court yet again for the same case and wait for several years till the execution order is passed.
Correlation for Criminal Cases
There are various stages through which criminal cases progress. Each stage has its own significance in a criminal case. To get a better and in-depth understanding of the life cycle of a case, it becomes essential to assess the effects of different stages on a case. For this purpose, each of the stages of a criminal case has been correlated with the days taken to dispose a case. In a nutshell, correlation helps in understanding the strength and direction of a relationship between two variables. The closer the value is to 1 or −1, the stronger the relationship between the two variables; a value close to 0 indicates a weak relationship. This helps in highlighting the relationship between stages of cases and the disposal timeframe of cases.
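As an illustration of how such a correlation can be computed, the sketch below calculates a Pearson correlation coefficient in Python. The choice of Pearson’s coefficient is an assumption, since the type of correlation used is not specified here. The hearing counts and disposal times are invented for illustration, and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical cases: number of hearings and days taken to dispose the case.
hearings = [4, 10, 15, 22, 30]
days_to_dispose = [120, 300, 480, 700, 900]

r = pearson(hearings, days_to_dispose)
print(f"Correlation between hearings and case life: {r:.3f}")
```

A value near 1 here would indicate that cases with more hearings tend to take longer to dispose, but it does not by itself establish that the hearings cause the delay.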
To carry out this correlation, several stage attributes are chosen as different variables. Table 4.2.7 depicts the variables with the strongest correlation with disposal time.
Table 4.2.7 focuses on two attributes of stages, that is days spent on a stage in the life cycle of a case and the number of hearings at each of the stages.23 These attributes are compared for cases that go through the evidence stage and ones that do not. These stage attributes are then correlated with the case life to see which attribute has the highest correlation. As per Table 4.2.7, the number of hearings in a case has a positive and strong correlation with case life for both categories of cases, which means that cases with more hearings tend to take more days to dispose, and vice versa. This is to be expected, since a greater number of hearings means that the case is prolonged and unable to get disposed. Adjournments can play a detrimental role in unnecessarily extending the number of hearings, and thus prolonging case life. Due to absence of the parties/witnesses or even the judge, hearings are pushed to succeeding dates, thus causing delay.
Table 4.2.7. Correlation of Different Case-Related Variables (Cases with Evidence and No Evidence Stage)

Days spent on the evidence stage has a high correlation with case life, which indicates that the speed with which cases get decided at the evidence stage has a strong impact on the disposal of cases. The faster a case goes through the evidence stage, the sooner the case can be completed, provided there is no other obstacle faced at any other stage.
Further, days spent on the stage of issuing notices/summons/warrants and the appearance stage have a moderately high correlation with case life for both categories of cases. However, cases not going through the evidence stage have a slightly stronger correlation. In certain cases, a great amount of time is spent on issuing summons to witnesses or issuing warrants to the accused. This can further hamper the day-to-day proceedings of the case. Despite consistent issue of summons and warrants, the accused or the witnesses fail to appear, which results in delay. The correlation between days spent on the appearance stage and disposal time is 0.649 for cases that do not go through the evidence stage, which shows that the court may be spending several days on the appearance stage, waiting for the accused/parties to appear.
Certain other stages, such as framing of charges and hearing stage, also feature in Table 4.2.7; however, they have a very weak correlation with case life.
To obtain a deeper understanding, it is important to go a step further in analysing this aspect. Since there are different types of cases which have been disposed at different times, a bifurcation between them would further help in giving a proper picture. In this context, Figure 4.2.9 focuses on the case type CC, as it forms a majority amongst the case types in Mysuru. Further, Figure 4.2.9 considers the evidence stage as a landmark and divides the cases into three phases, that is pre-evidence stage, evidence stage, and post-evidence stage. The pace at which cases move from these phases is worth analysing.
Figure 4.2.9. Average Days between Hearings in Different Phases for Criminal Cases (CC) in Various Disposal Time Brackets

Figure 4.2.9 splits up cases into different disposal time brackets, that is 0–2 years, 2–5 years, 5–10 years, and more than 10 years. The average days between hearings in the pre-evidence, evidence, and post-evidence stages are compared across the different time brackets. It may be noted that the days between hearings increase in all three phases as the case gets older. For cases in the 2–5 years and 5–10 years brackets, the days between hearings at the pre-evidence phase are higher when compared to those in the other phases. It is, therefore, necessary that judges complete the pre-evidence phase, comprising cognisance, framing of charges, etc., promptly and proceed with the evidence stage to ensure that cases get disposed in a timely manner.
Further, it can be seen that cases that take more than 10 years to be disposed have a much higher average of days between hearings in the post-evidence phase. Hence, even after the entire evidence phase is completed, there is delay at the final stages of the case, thus preventing the case from getting disposed.
Correlation for Civil Cases
Table 4.2.8 highlights the correlation of various stages with the disposal time of cases that go through the evidence stage and the ones that do not.24
Table 4.2.8. Correlation of Different Case-Related Variables (Evidence and No Evidence Stage)

The measure of days spent on the evidence stage has a very high correlation to disposal time. A higher positive correlation indicates that with an increase in the days spent on the evidence stage, there is an increase in the days taken to dispose the case, and vice versa. Hence, the evidence stage plays a crucial role in the life cycle of civil cases. However, cases that do not go through the evidence stage have a high correlation with days spent on notice and summons stage. A similar trend was observed even for criminal cases; however, the correlation for civil cases is found to be much stronger. Therefore, the notice and summons stage plays an important role for cases that do not go through the evidence stage. Often parties, especially the defendant, do not appear in the court despite repeated summons. Due to their absence during the hearings, cases continue to get adjourned and result in unnecessary delays.
Days between hearings and the days taken to dispose cases have a strong correlation to each other. The correlation is strong for both categories of cases— those that go through evidence and those that do not. Listing cases at regular intervals of time is crucial so that cases can progress through different stages quickly. The correlation highlights that days between hearings is all the more important for civil cases, as increase in the number of days would lead to increase in the time taken to dispose that case and vice versa.
Conclusion
The various analyses, insights, and statistical models provided in this chapter have been devised with the intention of helping judges at the ground level. One of the highlights of the chapter is that it predicts the number of cases that will be filed in the next few years in Mysuru. The forecasting model can be used by the principal district judge and the registry members to strategise and adopt certain measures that can be used to deal with the incoming case load. Based on the projected filings, the administration can take steps to ensure that there is adequate staff strength during months when the greatest number of filings is expected. One of the most important uses of the model can be at the national level, wherein number of expected filings in different states and districts in the country can be projected. The model can be used nationally by absorbing it into the new version of the NJDG that was released recently and provides several interactive charts and figures in relation to cases pending/disposed in courts. While the projection provided in this chapter focuses only on cases filed, the same can be expanded to even project the number of cases pending in a particular court over the next few years. Hence, the forecasting model can play an important role in managing the workload of judges in different courts.
The chapter also takes a deeper look into the life cycle of a case and studies the extent to which various variables affect the disposal of cases. While carrying out the analysis, cases that went through the evidence stage and the ones that did not go through the evidence stage were analysed separately. Overall it was observed that cases that do not go through the evidence stage tend to get disposed faster. In civil cases, it was observed that the average days spent on evidence stage, the average days between hearings, and the average days spent on notice/summons/ LCR have a high correlation with disposal of cases. Judges need to ensure that minimum adjournments are granted, and cases be listed in quick intervals wherever possible so that all the stages get completed quickly. Further, there is a strong need to bifurcate cases on procedural and substantive stages in courts. Judges need to focus on substantive stages while the procedural stages should be taken care of by the registrar of the court, thus freeing up judges’ time. The practice is mandated under the case flow management rules passed in several states; however, it is seldom being followed.
On the other hand, in criminal cases, the average number of hearings in a case and the average days spent on the notice/summons/warrants stage have a strong correlation with disposal of cases. Hence, judges should grant minimum adjournments so as to ensure the maximum number of effective hearings in a case. Summoning of accused, witnesses, or investigation officers should take place quickly so that proceedings continue smoothly.
It has been close to seven years since the inauguration of the e-courts online portal. Significant amounts of data have been gathered that can be used to suggest systemic changes in court processes. With increasing growth and enhancement of the e-courts system, more case-related data is made available which can be used to carry out various types of analyses. However, not all judges use data and technology at the optimal level. Bringing more awareness towards effective use of data and encouraging judges to bring changes based on data insights is important. It is high time that data becomes an integral part of the judiciary, as it can go a long way in bringing concrete changes in the system.
Annexure A: Holt-Winters method
The Holt-Winters model is used to forecast the behaviour of a sequence of values over different periods of time, also known as a time series. It is one of the most popular and widely used techniques for time series forecasting. The model is decades old; however, it is still used in various fields and in many applications.25
Forecasting always requires a model, and through Holt-Winters three aspects of a time series can be modelled. These are a typical value (average), a slope (trend) over time, and a cyclical repeating pattern (seasonality).26 Holt-Winters uses exponential smoothing to encode several values from the past and use them to predict ‘typical’ values for the present and future. Before proceeding further, it is important to explain exponential smoothing. The term refers to the use of an exponentially weighted moving average (EWMA) to ‘smooth’ a time series. If you have a time series, $x_t$, you can define a new time series $s_t$ that is a smoothed version of $x_t$, where the smoothing parameter $\alpha$ lies between 0 and 1:27

$$s_t = \alpha x_t + (1 - \alpha)\, s_{t-1}$$
There are three aspects to the time series model, that is, value, trend, and seasonality. These three aspects are expressed as three types of exponential smoothing; hence, the Holt-Winters model is also called triple exponential smoothing. The model uses the combined effect of these three influences and computes the prediction for a current or future value. The model requires several parameters: one for each smoothing (α, β, γ), the length of a season, and the number of periods in a season.28
Suppose you have a time series, $x_t$, along with a smoothed version $s_t$. You would like to forecast the next value of the series, $x_{t+1}$. To do this, you could use the last value you calculated for the EWMA, $s_t$. This works because the smoothed time series is the EWMA of the original series and, because of the way averages (and expectations) work, $s_t$ turns out to be a good prediction. Predicting the next value is called the one-step-ahead forecast.
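The one-step-ahead forecast described above can be illustrated with a minimal sketch. The function names, the choice to initialise the smoothed series with the first observation, and the monthly filing counts are assumptions made for illustration; they are not taken from the chapter's data or tooling.

```python
def ewma(series, alpha):
    """Return the exponentially weighted moving average of `series`.

    Uses s_t = alpha * x_t + (1 - alpha) * s_{t-1}.
    Assumption: s_0 is initialised with the first observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    if not series:
        raise ValueError("series must not be empty")
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def one_step_ahead(series, alpha):
    """Forecast the next value as the last smoothed value s_t."""
    return ewma(series, alpha)[-1]


# Hypothetical monthly filing counts, for illustration only.
filings = [120, 135, 128, 150, 142]
print(one_step_ahead(filings, alpha=0.5))
```

A larger value of alpha makes the forecast follow recent observations more closely, while a smaller value gives more weight to the older history.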
Further, in this chapter, we have used the Holt-Winters additive model to forecast the future case filings.
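The component form for the additive method is:29

$$\hat{y}_{t+h|t} = \ell_t + h b_t + s_{t+h-m(k+1)}$$
$$\ell_t = \alpha (y_t - s_{t-m}) + (1 - \alpha)(\ell_{t-1} + b_{t-1})$$
$$b_t = \beta^* (\ell_t - \ell_{t-1}) + (1 - \beta^*)\, b_{t-1}$$
$$s_t = \gamma (y_t - \ell_{t-1} - b_{t-1}) + (1 - \gamma)\, s_{t-m}$$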
where $k$ is the integer part of $(h-1)/m$, which ensures that the estimates of the seasonal indices used for forecasting come from the final year of the sample. The level equation shows a weighted average between the seasonally adjusted observation $(y_t - s_{t-m})$ and the non-seasonal forecast $(\ell_{t-1} + b_{t-1})$ for time $t$. The trend equation is identical to Holt’s linear method. The seasonal equation shows a weighted average between the current seasonal index, $(y_t - \ell_{t-1} - b_{t-1})$, and the seasonal index of the same season last year (that is, $m$ time periods ago).30
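The level, trend, and seasonal updates of the additive method can be sketched in a few lines of code. This is an illustrative sketch only. The function name, the simple initialisation from the first two seasons, and the fixed smoothing parameters are assumptions; in practice the parameters and initial states are usually estimated from the data, and this is not the implementation used to produce the chapter's forecasts.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, h):
    """Forecast h steps ahead with the additive Holt-Winters recursions.

    y: observations, m: number of periods in a season,
    alpha/beta/gamma: smoothing parameters for level/trend/season
    (beta plays the role of beta* in the trend equation).
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Assumed initial states: level = mean of the first season,
    # trend = average per-period change between the first two seasons,
    # seasonal indices = deviations of the first season from its mean.
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]

    for t, obs in enumerate(y):
        s_prev = season[t % m]               # s_{t-m}
        prev_level, prev_trend = level, trend
        level = alpha * (obs - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season[t % m] = (gamma * (obs - prev_level - prev_trend)
                         + (1 - gamma) * s_prev)

    n = len(y)
    # season[(n + k) % m] is the latest index for that position in the
    # cycle, i.e. the seasonal estimate from the final year of the sample.
    return [level + (k + 1) * trend + season[(n + k) % m] for k in range(h)]


# Hypothetical quarterly series with a repeating pattern and no trend.
history = [10, 20, 30, 40] * 3
print(holt_winters_additive(history, m=4, alpha=0.5, beta=0.5, gamma=0.5, h=4))
```

For a purely repeating series such as the hypothetical one above, the forecast reproduces the seasonal pattern, which is a quick sanity check on the recursions.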
The equation for the seasonal component is often expressed as:

$$s_t = \gamma^* (y_t - \ell_t) + (1 - \gamma^*)\, s_{t-m}$$
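If we substitute $\ell_t$ from the smoothing equation for the level of the component form above, we get:

$$s_t = \gamma^* (1 - \alpha)(y_t - \ell_{t-1} - b_{t-1}) + [1 - \gamma^* (1 - \alpha)]\, s_{t-m}$$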
which is identical to the smoothing equation for the seasonal component we specify here, with $\gamma = \gamma^* (1 - \alpha)$. The usual parameter restriction is $0 \le \gamma^* \le 1$, which translates to $0 \le \gamma \le 1 - \alpha$.31
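The substitution can be written out step by step. The sketch below restates the level equation and the $\gamma^*$ form of the seasonal equation so that it can be read on its own; the intermediate steps are ordinary algebra and are not reproduced from the cited text.

```latex
\begin{align*}
\ell_t &= \alpha (y_t - s_{t-m}) + (1-\alpha)(\ell_{t-1} + b_{t-1}) \\
y_t - \ell_t &= (1-\alpha)\, y_t + \alpha\, s_{t-m} - (1-\alpha)(\ell_{t-1} + b_{t-1}) \\
             &= (1-\alpha)(y_t - \ell_{t-1} - b_{t-1}) + \alpha\, s_{t-m} \\
s_t &= \gamma^* (y_t - \ell_t) + (1-\gamma^*)\, s_{t-m} \\
    &= \gamma^*(1-\alpha)(y_t - \ell_{t-1} - b_{t-1})
       + \gamma^* \alpha\, s_{t-m} + (1-\gamma^*)\, s_{t-m} \\
    &= \gamma^*(1-\alpha)(y_t - \ell_{t-1} - b_{t-1})
       + \bigl[1 - \gamma^*(1-\alpha)\bigr]\, s_{t-m}
\end{align*}
```

Reading off the coefficient of $(y_t - \ell_{t-1} - b_{t-1})$ gives $\gamma = \gamma^*(1-\alpha)$, and since $0 \le \gamma^* \le 1$, it follows that $0 \le \gamma \le 1 - \alpha$.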
in depicting the estimated probability of survival at each recorded time of failure in the data. The equation for the Kaplan-Meier estimator is:35

$$S(t) = \prod_{i:\, t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$$
where $t_i$ is the duration of study at point $i$, $d_i$ is the number of cases disposed at $t_i$, and $n_i$ is the number of cases pending just prior to $t_i$. S is based upon the probability that a case will remain pending at the end of a time interval, on the condition that the case was still pending at the start of the time interval. S is the product (Π) of these conditional probabilities.
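The product of conditional probabilities can be illustrated with a minimal sketch. The function name, the representation of cases as (duration, disposed) pairs, and the sample durations are assumptions made for illustration. Cases still pending at the end of the study are treated as censored: they count towards the number pending but never as disposals.

```python
from collections import Counter


def kaplan_meier(durations, disposed):
    """Return [(t_i, S(t_i))] for each time at which a case was disposed.

    durations: days each case was observed.
    disposed: True if the case was disposed at that duration,
              False if it was still pending (censored) when observation ended.
    """
    if len(durations) != len(disposed):
        raise ValueError("durations and disposed must have the same length")

    events = Counter(t for t, d in zip(durations, disposed) if d)
    exits = Counter(durations)      # disposals and censored cases leaving at t
    pending = len(durations)        # n_i: cases pending just prior to t_i
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d_i = events.get(t, 0)
        if d_i:
            survival *= 1 - d_i / pending
            curve.append((t, survival))
        pending -= exits[t]
    return curve


# Hypothetical case durations in days, for illustration only.
days = [100, 200, 200, 300, 400]
was_disposed = [True, True, False, True, False]
for t, s in kaplan_meier(days, was_disposed):
    print(t, round(s, 3))
```

Each printed value estimates the probability that a case is still pending after the given number of days.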
Notes
- Hindustan Times. 2019. ‘IIT-Madras Offers AI-based Tech to Help Army Predict Stone Pelting in Jammu and Kashmir’, Hindustan Times, 23 January, available online at https://www.hindustantimes.com/indianews/iit-madras-offers-ai-based-tech-to-help-army-predict-stone-pelting-in-jammu-and-kashmir/story-nFpvW68m42WNLXmu2AnIBN.html (accessed on 1 February 2019).
- E-courts services, available online at http://www.ecourts.gov.in/ecourts_home/static/about-us.php (accessed on 28 January 2019).
- As per e-courts, judges belonging to the same cadre are grouped together as court establishments.
- Due to certain technical problems in extracting data for two court establishments, data from all 21 court establishments could not be extracted.
- Arunav Kaul, Ahmed Pathan, and Harish Narasappa. 2017. ‘Deconstructing Delay: Analyses of Data from High Courts and Subordinate Courts’, Approaches to Justice in India: A Report by DAKSH, 2: 94–95.
- Martin Zarcek. 2017. Introduction to Time Series. Czech Republic: University of Ostrava, p. 1.
- Abel Matus Castillejos. 2006. ‘Management of Time Series Data’, University of Canberra Australia, p. 1, available online at https://rafalab.github.io/pages/754/section-10.pdf (accessed on 3 February 2019).
- Zarcek, ‘Introduction to Time Series’, p. 1.
- William F. Shughart and Gokhan Karahan. 2003. ‘A Study of the Determinants of Case Growth in U.S. Federal District Courts’, Report to the U.S. Department of Justice, pp. 1–2, available online at https://www.ncjrs.gov/pdffiles1/nij/grants/204010.pdf (accessed on 1 February 2019).
- Data for all the years has been taken from National Judicial Data Grid as of 28 January 2019.
- Marcel Dettling. 2014. ‘Applied Time Series Analysis’, Zurich University of Applied Sciences, p. 3, available online at https://stat.ethz.ch/education/semesters/ss2014/atsa/Scriptum_v140523.pdf (accessed on 5 February 2019).
- Dettling, ‘Applied Time Series Analysis’, p. 3.
- Dettling, ‘Applied Time Series Analysis’, p. 4.
- While there are different time series models to project data, the chapter uses the Holt-Winters model to forecast the case filings. The model is easy to understand and provides accurate future projections.
- Vargas Perez, Carmen, and Penaloza Figueroa. 2017. ‘The Duration of Civil Cases. A Survival Analysis’, Revista Perspectivas 1(1): 47–59, available online at https://eprints.ucm.es/42178/1/The%20Duration%20of%20Civil%20Cases_A%20survival%20analysis.%20Vargass%20y%20Pe%C3%B1aloza%20%281%29.pdf (accessed on 3 February 2019).
- Simona Despa. ‘What is Survival Analysis?’, Cornell Statistical Consulting Unit, Cornell University, p. 1, available online at https://www.cscu.cornell.edu/news/statnews/stnews78.pdf (accessed on 5 February 2019).
- Despa, ‘What is Survival Analysis?’, p. 1.
- Despa, ‘What is Survival Analysis?’, p. 1.
- Perez et al., ‘The Duration of Civil Cases. A Survival Analysis’, p. 48.
- Despa, ‘What is Survival Analysis?’, p. 1.
- Perez et al., ‘The Duration of Civil Cases. A Survival Analysis’, p. 48.
- Despa, ‘What is Survival Analysis?’, p. 1.
- Only disposed cases have been considered in the correlation. Cases that have been disposed in only one hearing have not been considered in the correlation.
- Only disposed cases have been considered in the correlation. Cases that have been disposed in only one hearing have not been considered in the correlation.
- Preetam Jinka. 2018. ‘Holt-Winters Forecasting Simplified’, available online at https://www.vividcortex.com/blog/holt-winters-forecasting-simplified (accessed on 22 April 2019).
- Preetam Jinka. 2017. ‘Exponential Smoothing for Time Series Forecasting’, available online at https://www.vividcortex.com/blog/exponential-smoothing-for-timeseriesforecastin (accessed on 22 April 2019).
- Preetam Jinka, ‘Exponential Smoothing for Time Series Forecasting’.
- Preetam Jinka, ‘Holt-Winters Forecasting Simplified’.
- Rob J. Hyndman and George Athanasopoulos. 2018. Forecasting: Principles and Practice, 2nd edition, available online at https://otexts.com/fpp2/ (accessed on 23 April 2019).
- Hyndman and Athanasopoulos, Forecasting.
- Hyndman and Athanasopoulos, Forecasting.
- Despa, ‘What is Survival Analysis?’, p. 1.
- Despa, ‘What is Survival Analysis?’, p. 1.
- Despa, ‘What is Survival Analysis?’, p. 1.
- Edward L. Kaplan and Paul Meier. 1958. ‘Nonparametric Estimation from Incomplete Observations’, Journal of the American Statistical Association, 53(282): 457–481.