In the previous article (Agile teams reporting using Jira API and R language – part 1) I described how to automatically download data from Jira using its REST API, parse it and export it to a CSV file for later use. Once you have the data, it’s time to visualize it! There are many ways to visualize tabular data, and Excel is usually the first choice. I chose the R language for a couple of reasons:

  1. Data manipulation is much easier in R
  2. The project is stored in text format, which makes it easy to keep in a version control system (like git)
  3. It’s reusable
  4. It’s free

And, most importantly:

  1. R produces beautiful charts!

I’d like to create a chart that shows how many bugs were reported and fixed by a team each week. This information is useful for observing backlog dynamics, analyzing trends and reacting quickly to potential problems. Additionally, I’d like to see how the backlog changes from week to week: whether defects are resolved in a timely manner or their number keeps growing.

Data manipulation

I used RStudio to create this example. You can find it here: https://www.rstudio.com/

Importing the data is very straightforward. For the dataset produced in the previous article I used the following command in my R project:

bugs <- read.csv(file = "bugs.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)

We need to group defects by week, so we should add to the dataset the week in which each issue was created. The line below does this:

bugs[, "CreatedWeek"] <- as.Date(cut(as.Date(bugs$created), breaks = "week"))
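
To see what this line does, here is a minimal sketch on three made-up dates (they are not taken from the real Jira export). By default, cut() with breaks = "week" puts each date into a bin that starts on Monday, and as.Date() turns the bin label back into a date:

```r
# Hypothetical creation dates: a Wednesday, the following Sunday and the Monday after it
created <- c("2017-03-01", "2017-03-05", "2017-03-06")

# cut() assigns each date to a week-long bin labelled with the bin's first day;
# as.Date() converts that label back into a Date
createdWeek <- as.Date(cut(as.Date(created), breaks = "week"))

print(createdWeek)
# "2017-02-27" "2017-02-27" "2017-03-06"
```

The first two dates fall into the week starting on Monday 2017-02-27, while the third one opens a new week.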

To simplify calculations, I added a Quantity column with a value of one to the dataset.

bugs[, "Quantity"] <- 1

Now it’s time to aggregate issues by the CreatedWeek and team fields and sum the Quantity field.

createdIssuesByTeamAggr <- aggregate(x = bugs$Quantity, by = list(bugs$CreatedWeek, bugs$team), FUN = sum)

After aggregation, R names the columns of the new table automatically. For later use and better readability I recommend renaming them with the following commands:

colnames(createdIssuesByTeamAggr)[1] <- "week"
colnames(createdIssuesByTeamAggr)[2] <- "team"
colnames(createdIssuesByTeamAggr)[3] <- "opened"
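
As an alternative, aggregate() also has a formula interface that keeps the original column names, so all three columns can be renamed in one step. The sketch below runs on a toy data frame; toyBugs, toyAggr and their values are made up for illustration and only mimic the columns used above:

```r
# Toy stand-in for the bugs data frame (values are invented)
toyBugs <- data.frame(
  CreatedWeek = as.Date(c("2017-02-27", "2017-02-27", "2017-03-06", "2017-03-06")),
  team = c("A", "A", "A", "B"),
  Quantity = 1,
  stringsAsFactors = FALSE
)

# Formula interface: sum Quantity for every CreatedWeek/team combination
toyAggr <- aggregate(Quantity ~ CreatedWeek + team, data = toyBugs, FUN = sum)

# Rename all three columns at once
colnames(toyAggr) <- c("week", "team", "opened")

print(toyAggr)
```

Team A reported two bugs in the first week and one in the second, and team B reported one in the second week, so the result has three rows.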

After all those operations you should see something like this in RStudio:

createdIssuesByTeamAggr

Now it is time to prepare the data on completed issues. In this project an issue is completed when it moves to the Closed or Testing completed state.

completed <- bugs[bugs$status %in% c("Closed", "Testing completed"),]

Again, we would like to group completed issues by week, so we add a column with the week information to the dataset.

completed[, "completedWeek"] <- as.Date(cut(as.Date(completed$updated), breaks = "week"))

This is a simplified approach, because it assumes that an issue is not updated (commented on, fields changed) after it moves to its final state, but for our example it should work fine.

Now we have all the information needed to aggregate completed items by week and team. We can do it in the same way as for opened issues. Additionally, for better readability, I recommend sorting the issues by completion week. The commands below do the job:

completedByTeamAggr <- aggregate(x = completed$Quantity, by = list(completed$completedWeek, completed$team), FUN = sum)
colnames(completedByTeamAggr)[1] <- "week"
colnames(completedByTeamAggr)[2] <- "team"
colnames(completedByTeamAggr)[3] <- "closed"
completedByTeamAggr <- completedByTeamAggr[order(completedByTeamAggr$week), ]

Again, it is always a good idea to view the modified dataset to make sure everything looks as expected. After running View(completedByTeamAggr) you should see the following in RStudio:

completedByTeamAggr

It’s time to merge the two datasets. This can be done with the merge command, specifying the merge fields, in this case week and team:

createdVsClosedbyTeam <- merge(createdIssuesByTeamAggr, completedByTeamAggr, by = c("week", "team"))
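
One thing worth knowing: by default merge keeps only the week and team pairs that appear in both tables, so a week in which a team reported bugs but closed none disappears from the result. If that matters for your report, a possible variation (a sketch on made-up data, not the approach used in the rest of this article) is to merge with all = TRUE and replace the missing counts with zero:

```r
# Toy aggregates: team A opened bugs in two weeks but closed bugs only in the second one
toyOpened <- data.frame(
  week = as.Date(c("2017-02-27", "2017-03-06")),
  team = c("A", "A"),
  opened = c(3, 2),
  stringsAsFactors = FALSE
)
toyClosed <- data.frame(
  week = as.Date("2017-03-06"),
  team = "A",
  closed = 1,
  stringsAsFactors = FALSE
)

# Default merge: only the week present in both tables survives
innerJoin <- merge(toyOpened, toyClosed, by = c("week", "team"))

# all = TRUE keeps every week/team pair; missing counts come back as NA
outerJoin <- merge(toyOpened, toyClosed, by = c("week", "team"), all = TRUE)

# Treat "no rows in that week" as zero bugs
outerJoin$opened[is.na(outerJoin$opened)] <- 0
outerJoin$closed[is.na(outerJoin$closed)] <- 0

print(nrow(innerJoin)) # 1
print(outerJoin)       # 2 rows, closed = 0 in the first week
```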

To observe the trend, we add a column that shows the difference between reported and closed issues (opened minus closed). This information is stored in a column named left:

createdVsClosedbyTeam[, "left"] <- as.integer(createdVsClosedbyTeam$opened) - as.integer(createdVsClosedbyTeam$closed)

Next, we add a column with the cumulative sum of left to track the trend over time. First we filter the dataset by team (the team variable holds the name of the team to report on) and sort the rows by week to get correct results:

createdVsClosedOneTeam <- createdVsClosedbyTeam[createdVsClosedbyTeam$team == team, ]
createdVsClosedOneTeam <- createdVsClosedOneTeam[order(createdVsClosedOneTeam$week), ]
createdVsClosedOneTeam[, "cumSumLeft"] <- cumsum(createdVsClosedOneTeam$left)
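
To make the two derived columns concrete, here is a small worked example on invented numbers (toyWeekly is a hypothetical data frame for a single team, already sorted by week):

```r
# Three weeks of made-up counts for one team
toyWeekly <- data.frame(
  week = as.Date(c("2017-02-27", "2017-03-06", "2017-03-13")),
  opened = c(5, 2, 4),
  closed = c(3, 4, 1)
)

# Weekly balance: positive when more bugs were reported than closed
toyWeekly$left <- toyWeekly$opened - toyWeekly$closed   # 2, -2, 3

# Running total of the weekly balance
toyWeekly$cumSumLeft <- cumsum(toyWeekly$left)          # 2, 0, 3

print(toyWeekly)
```

Note that the running total only counts bugs from the first week in the dataset onwards, so it shows how the backlog changed over the period rather than its absolute size.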

Below you can find the dataset after all changes in RStudio:

createdVsClosedOneTeam

Data visualization

The dataset is ready, so it’s high time for visualization. Let’s start with a simple graph with points for opened and closed defects.

library(ggplot2) # ggplot() and the geom_* functions come from this package

ggplot(createdVsClosedOneTeam, aes(x = week)) +
  geom_point(aes(y = opened, colour = "reported bugs")) +
  geom_point(aes(y = closed, colour = "closed bugs")) +
  scale_colour_manual("Lines", values = c("closed bugs" = "green", "reported bugs" = "red"))

point plot in R

It is not very meaningful, is it? Those points are quite easy to miss. We can add lines between them for better tracking (the two geom_line calls at the end are new).

plot <- ggplot(createdVsClosedOneTeam, aes(x = week)) + # basic graphical object
  geom_point(aes(y = opened, colour = "reported bugs")) +
  geom_point(aes(y = closed, colour = "closed bugs")) +
  scale_colour_manual("Lines", values = c("closed bugs" = "green", "reported bugs" = "red")) +
  geom_line(aes(y = opened, colour = "reported bugs")) +
  geom_line(aes(y = closed, colour = "closed bugs"))
print(plot) # the chart is stored in a variable, so it has to be printed to be displayed

line plot in R with data points

Looks much better. Next we need to add the cumulative sum of the defects that are left. As you can see above, the parts of the graph are added incrementally, one layer after another. I don’t want the bars to cover the lines, so the command that adds the cumulative sum bars comes before the points and lines (the geom_bar call is new).

plot <- ggplot(createdVsClosedOneTeam, aes(x = week)) + # basic graphical object
  geom_bar(stat = "identity", aes(y = cumSumLeft), fill = "white", colour = "red") + # bars drawn first, so they stay in the background
  geom_point(aes(y = opened, colour = "reported bugs")) +
  geom_point(aes(y = closed, colour = "closed bugs")) +
  scale_colour_manual("Lines", values = c("closed bugs" = "green", "reported bugs" = "red")) +
  geom_line(aes(y = opened, colour = "reported bugs")) + # first line
  geom_line(aes(y = closed, colour = "closed bugs")) # second line

Chart with added cumulative bars

Something is still missing. For a wider audience it’s worth adding labels to both the X and Y axes and a title to the chart.

plot <- ggplot(createdVsClosedOneTeam, aes(x = week)) + # basic graphical object
  geom_bar(stat = "identity", aes(y = cumSumLeft), fill = "white", colour = "red") +
  geom_line(aes(y = opened, colour = "reported bugs")) + # first line
  geom_point(aes(y = opened, colour = "reported bugs")) +
  geom_line(aes(y = closed, colour = "closed bugs")) + # second line
  geom_point(aes(y = closed, colour = "closed bugs")) +
  scale_colour_manual("Lines", values = c("closed bugs" = "green", "reported bugs" = "red")) +
  ggtitle(paste("Bugs reported vs closed (weekly) + cumulative sum of not closed bugs in team:", team, sep = " ")) +
  xlab("week") +
  ylab("Number of bugs")

Chart with described axis and title

That’s it! You now have a pretty graph visualizing the dataset. It’s only the beginning of the road, as you can do much more in R. As homework, you can add release information to the graph to see the correlation between reported issues and production deployments, as in the graph below.

Chart with added information about release dates
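
One possible way to approach the homework is to draw a vertical line for every release with geom_vline. This is only a sketch under assumptions: the weekly counts and the release dates below are invented, and toyWeekly, toyReleases and releasePlot are hypothetical names, so replace them with your own dataset and deployment calendar.

```r
library(ggplot2)

# Made-up weekly data standing in for createdVsClosedOneTeam
toyWeekly <- data.frame(
  week = as.Date(c("2017-02-27", "2017-03-06", "2017-03-13", "2017-03-20")),
  opened = c(5, 2, 4, 6),
  closed = c(3, 4, 1, 2)
)

# Hypothetical release dates
toyReleases <- data.frame(date = as.Date(c("2017-03-08", "2017-03-17")))

releasePlot <- ggplot(toyWeekly, aes(x = week)) +
  geom_line(aes(y = opened, colour = "reported bugs")) +
  geom_line(aes(y = closed, colour = "closed bugs")) +
  # one dashed vertical line per release date
  geom_vline(data = toyReleases, aes(xintercept = date), linetype = "dashed") +
  scale_colour_manual("Lines", values = c("closed bugs" = "green", "reported bugs" = "red")) +
  xlab("week") +
  ylab("Number of bugs")

print(releasePlot)
```

Keeping the release dates in their own data frame means a new release only requires one more row, not another layer in the plot.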
