Measuring A/B Test Effectiveness with Sixpack and Fluentd

Hello, this is Ainoya (@ainoya) from the AP Solution Group at Recruit Technologies.

Sixpack is an A/B testing framework that lets you check the results of an A/B test quickly. In this post I combine it with Fluentd and show how to measure A/B test effectiveness in Sixpack from conversion logs that have already been collected.

How do you monitor your A/B tests? A common approach is to embed JavaScript provided by a SaaS traffic analytics service in the front end and let it assign the test patterns. With Sixpack you have to spend some time setting up the aggregation server and client libraries yourself, but in exchange it is highly extensible. That makes it a good fit in the following cases:

You do not want to embed JavaScript provided by a third party in your service.
You want to control test patterns somewhere other than client-side JavaScript.
You already have an in-house A/B test module but want to automate the tedious work of measuring effectiveness.

Measuring A/B test effectiveness with Sixpack

Sixpack is an A/B testing framework developed by SeatGeek that lets you run A/B tests and measure their effectiveness. As an aside, I suspect the name Sixpack is a play on the "ab" in A/B test, and I am always impressed by this kind of naming sense in overseas software.

Sixpack is written in Python and consists of the two servers listed below. A nice touch is that the results of A/B tests accumulated through the Sixpack API server can be checked immediately on the web dashboard.

An API server that exchanges A/B test assignment and result information with clients
A dashboard server that visualizes the measured effectiveness

Sixpack API server

Client libraries are provided for multiple languages, and if you want to implement your own you can refer to the client specification. The API itself is simple, so this is easy to do.

The API has the following two endpoints. Both are called with GET requests that carry parameters such as the client ID. See the README for details on how to use them.

participate: Get (and register) the A/B test assignment for a client
convert: Register a conversion for the A/B test
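
To make the two endpoints concrete, here is a minimal sketch that only builds the GET URLs. The function names are my own, and the query parameter names (experiment, alternatives, client_id) are assumptions based on the log keys used later in this post and on my reading of the Sixpack README, so please check them against the README before relying on them.

```python
from urllib.parse import urlencode, urljoin


def participate_url(base_url, experiment, alternatives, client_id):
    """Build the URL for a participate request (hypothetical helper).

    Assumption: each alternative is sent as a repeated `alternatives`
    query parameter.
    """
    query = urlencode(
        {
            "experiment": experiment,
            "alternatives": alternatives,  # list -> repeated parameter
            "client_id": client_id,
        },
        doseq=True,
    )
    return urljoin(base_url, "participate") + "?" + query


def convert_url(base_url, experiment, client_id):
    """Build the URL for a convert request (hypothetical helper)."""
    query = urlencode({"experiment": experiment, "client_id": client_id})
    return urljoin(base_url, "convert") + "?" + query


if __name__ == "__main__":
    base = "http://localhost:5000/"  # API server URL used in this post
    print(participate_url(base, "header-color", ["red", "green", "blue"], "ID-0000-0001"))
    print(convert_url(base, "header-color", "ID-0000-0001"))
```

Sending either URL with any HTTP client (for example urllib.request.urlopen) would issue the request; the sketch stops at URL construction so that it can be run without a Sixpack server.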

Sixpack uses Redis as its data store and keeps the A/B test measurement data there.

In a high-load environment, you should consider using Redis Sentinel rather than just running multiple instances of the API server. Options for using Sentinel are available in the Sixpack config. (I have not verified this setup myself.)

Sixpack Web Dashboard

The effectiveness data stored through the Sixpack API can be viewed in the web dashboard, a web application that runs separately from the API server. A sample screen is shown below. The summary shows, as a time series, how often each A/B test alternative was assigned and its conversion rate. You can also check the result of a likelihood-ratio test (G-test) at the same time.

sixpack-web-demo
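
For readers unfamiliar with the G-test shown on the dashboard, the following is a minimal sketch of the statistic for two alternatives. It is a generic textbook calculation, not Sixpack's own implementation, and the function name and the counts in the demo are made up for illustration.

```python
import math


def g_statistic(conversions_a, participants_a, conversions_b, participants_b):
    """G statistic for a 2x2 table of converted / not converted counts.

    G = 2 * sum(observed * ln(observed / expected)), where the expected
    counts assume both alternatives share the same conversion rate.
    """
    observed = [
        [conversions_a, participants_a - conversions_a],
        [conversions_b, participants_b - conversions_b],
    ]
    total = participants_a + participants_b
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o == 0:
                continue  # a zero cell contributes nothing to the sum
            expected = row_totals[i] * col_totals[j] / total
            g += 2.0 * o * math.log(o / expected)
    return g


if __name__ == "__main__":
    # Illustrative counts only: 10/100 conversions vs. 20/100 conversions.
    print(g_statistic(10, 100, 20, 100))
```

With one degree of freedom, a G value above roughly 3.84 corresponds to significance at the 5% level under the chi-square approximation.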

Feeding Fluentd logs into Sixpack to measure A/B test effectiveness

That covers the overview of Sixpack. When you actually try to adopt it, however, you may run into concerns like these:

You already have an A/B test module in place, and the two would conflict with each other.
You do not want your application to call extra APIs while it is handling web requests.

So, as an alternative, how about leaving the work to Fluentd and storing the A/B test information in Sixpack from your logs? I wrote an output plugin, fluent-plugin-sixpack, that sends log data to the Sixpack API. Using it, I will show how to aggregate A/B test information from actual log output in Sixpack.

Configuring fluent-plugin-sixpack

Usage is not difficult. The plugin simply maps log fields to the parameters that the Sixpack API expects, so a minimal td-agent.conf only needs the URL of the Sixpack API server.

<match app.abtest.log>
  type sixpack
  sixpackapi_url http://localhost:5000/ # URL of the Sixpack API
</match>

The plugin expects log records in one of the following two formats.

# A/B test participation record
{"record_type":"participate",     # record type: A/B test participation
 "experiment":"header-color",     # name of the A/B test
 "alternatives":"red,green,blue", # all alternatives of the A/B test
 "alternative":"red",             # alternative selected in the A/B test
 "client_id":"ID-0000-0001"}      # ID that identifies the participant

# A/B test conversion record
{"record_type":"convert",         # record type: A/B test conversion
 "experiment":"header-color",     # name of the A/B test
 "client_id":"ID-0000-0001"}      # ID that identifies the participant

To satisfy the Sixpack API specification, you need to shape your logs into the formats above, for example with fluent-plugin-record-modifier and fluent-plugin-parser. The plugin also has options for changing the key names, so check the README to see which options are available.
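
When reshaping logs, it can help to check that each record carries the keys listed above before it reaches the plugin. The following is a minimal sketch of such a check. The helper name is hypothetical and it is not part of fluent-plugin-sixpack; the key names are taken from the two record formats in this post and assume the plugin's default key names.

```python
import json

# Keys each record type needs, taken from the two formats in this post.
REQUIRED_KEYS = {
    "participate": {"record_type", "experiment", "alternatives",
                    "alternative", "client_id"},
    "convert": {"record_type", "experiment", "client_id"},
}


def missing_keys(record):
    """Return the sorted list of required keys that the record lacks."""
    record_type = record.get("record_type")
    if record_type not in REQUIRED_KEYS:
        raise ValueError(f"unknown record_type: {record_type!r}")
    return sorted(REQUIRED_KEYS[record_type] - set(record))


if __name__ == "__main__":
    line = ('{"record_type":"convert","experiment":"header-color",'
            '"client_id":"ID-0000-0001"}')
    print(missing_keys(json.loads(line)))
```

An empty list means the record has every key for its type; anything else names the fields that still need to be added during reshaping.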

Trying Sixpack and Fluentd together

A Dockerfile is available so that you can easily try out measuring A/B test effectiveness with Fluentd and Sixpack. I have also published the built image on Docker Hub, so feel free to use it as in the following examples.

Starting the Docker container

docker pull ainoya/sixpack
docker run -t --name sixpack-server \
            -p 49022:49022 -p 24224:24224 -p 5000:5000 \
            -p 5001:5001 ainoya/sixpack

Sending logs

## Log in to the container over SSH with the password "sixpac"
ssh -o UserKnownHostsFile=/dev/null \
    -o StrictHostKeyChecking=no root@localhost -p 49022

## Send an A/B test participation log
/opt/td-agent/embedded/bin/fluent-post -t 'sixpack.test' \
          -v 'record_type=participate' -v 'experiment=header-color' \
          -v 'alternatives=red,green,blue' -v 'alternative=red' \
          -v 'client_id=id0001'

## Send an A/B test conversion log
/opt/td-agent/embedded/bin/fluent-post -t 'sixpack.test' \
          -v 'record_type=convert' -v 'experiment=header-color' \
          -v 'client_id=id0001'

Checking the updated data in the Sixpack web dashboard

open http://localhost:5001/

Summary

In this post I showed how to measure A/B test effectiveness from log data using the A/B testing framework Sixpack together with Fluentd. Rather than calling the Sixpack API directly from a client library, you can substitute those API requests by feeding logs through Fluentd, which lets you introduce A/B test measurement while keeping changes to your existing web application to a minimum. Looking further ahead, since effectiveness can be measured from log data alone, I think it would be interesting to A/B test all kinds of interfaces, such as the following:

A/B testing of APIs, beyond the web UI
A/B testing of middleware and infrastructure, beyond web applications