Produce monolog messages through kafka+avro
author Erik Bernhardson <ebernhardson@wikimedia.org>
Tue, 4 Aug 2015 18:02:47 +0000 (11:02 -0700)
committer Erik Bernhardson <ebernhardson@wikimedia.org>
Mon, 21 Sep 2015 19:45:23 +0000 (12:45 -0700)
commit f66559b616f47c35dcfe464fe61835c0cdcd7591
tree 3dd36b8b67a2f5da6ebc2d4a0102798c0a8c548b
parent 22c163326fc7ea711332b7ae2870ea802d228b77

This allows a logging channel to be configured to write
directly to Kafka. Logs can be serialized either to JSON
blobs or to the more compact Apache Avro format.

The Kafka handler for monolog needs a list of one or more
Kafka servers to query cluster metadata from. It should be
able to use any monolog formatter, although some, like
JsonFormatter, require you to disable formatBatch because the
Kafka protocol prefers each record to be encoded
independently. This requires the nmred/kafka-php library,
version >= 1.3.0.
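
To illustrate the formatBatch point: a batch formatter folds all
records into one payload, whereas a Kafka producer wants one payload
per record. A rough sketch in plain PHP; the helper names
encodeAsBatch and encodeIndependently are hypothetical and are not
part of monolog or kafka-php:

```php
<?php
// Hypothetical helpers for illustration only; not monolog or kafka-php APIs.

// Batch-style encoding: every record ends up inside a single JSON array,
// so a consumer cannot read one record without parsing the whole blob.
function encodeAsBatch( array $records ) {
    return json_encode( $records );
}

// Per-record encoding: one payload per record, so each one can be
// sent as its own Kafka message.
function encodeIndependently( array $records ) {
    return array_map( 'json_encode', $records );
}

$records = array(
    array( 'channel' => 'example', 'message' => 'first' ),
    array( 'channel' => 'example', 'message' => 'second' ),
);

$batch = encodeAsBatch( $records );          // a single string
$messages = encodeIndependently( $records ); // an array of two strings
```

With the second form each element of $messages can be handed to the
producer as a separate message.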

Adds a new formatter which serializes to the Apache Avro
format. This is a compact binary format which uses
pre-defined schemas. This initial implementation is very
simple and takes the plain schemas as a constructor argument.
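
For reference, an Avro record schema written as a PHP array might look
roughly like the sketch below. The record name ExampleLog and all of
its fields are made up for illustration and are not the schema used by
any real channel; it assumes schemas are passed in the decoded-array
form shown in the config further down.

```php
<?php
// Hypothetical schema for illustration; the record name and fields are assumptions.
$schema = array(
    'type' => 'record',
    'name' => 'ExampleLog',
    'fields' => array(
        array( 'name' => 'message', 'type' => 'string' ),
        array( 'name' => 'timestamp', 'type' => 'long' ),
        // A union with "null" listed first makes the field optional in Avro.
        array( 'name' => 'user', 'type' => array( 'null', 'string' ), 'default' => null ),
    ),
);

// Avro schemas are defined in JSON, so the array maps directly onto it.
$schemaJson = json_encode( $schema );
```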

Adds a new option to MonologSpi to wrap handlers in a
BufferHandler. The wrapped handler doesn't flush until the
request shuts down, which prevents any network requests in
the logger from adding latency to web requests.
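
The buffering idea can be sketched in plain PHP as follows. This is
not the BufferHandler added here; DeferredBuffer and its sink callback
are hypothetical stand-ins, and register_shutdown_function is assumed
as the flush trigger purely for the sake of the example.

```php
<?php
// Hypothetical stand-in for a buffering wrapper; not the BufferHandler in this change.
class DeferredBuffer {
    private $buffer = array();
    private $sink;

    // $sink is any callable that receives the full list of buffered records.
    public function __construct( $sink ) {
        $this->sink = $sink;
    }

    // Cheap in-memory append; no I/O happens here.
    public function handle( array $record ) {
        $this->buffer[] = $record;
    }

    // Hand everything to the sink at once and clear the buffer.
    public function flush() {
        if ( $this->buffer ) {
            call_user_func( $this->sink, $this->buffer );
            $this->buffer = array();
        }
    }
}

$sent = array();
$buffer = new DeferredBuffer( function ( array $records ) use ( &$sent ) {
    // A real sink would do the network write here.
    $sent = array_merge( $sent, $records );
} );

// Flush once when the request shuts down rather than on every log call.
register_shutdown_function( array( $buffer, 'flush' ) );

$buffer->handle( array( 'message' => 'first' ) );
$buffer->handle( array( 'message' => 'second' ) );
```

Nothing reaches the sink while the request is being served; the
records are delivered in one go when flush() runs.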

Related mediawiki/vendor update: Ibfe4bd2036ae8e998e2973f07bd9a6f057691578

The necessary config is something like:

array(
    'loggers' => array(
        'CirrusSearchRequests' => array(
            'handlers' => array( 'kafka' ),
        ),
    ),
    'handlers' => array(
        'kafka' => array(
            'factory' => '\\MediaWiki\\Logger\\Monolog\\KafkaHandler::factory',
            'args' => array( 'localhost:9092' ),
            'formatter' => 'avro',
            'buffer' => true,
        ),
    ),
    'formatters' => array(
        'avro' => array(
            'class' => '\\MediaWiki\\Logger\\Monolog\\AvroFormatter',
            'args' => array(
                array(
                    'CirrusSearchRequests' => array(
                        'type' => 'record',
                        'name' => 'CirrusSearchRequests',
                        'fields' => array( ... )
                    ),
                ),
            ),
        ),
    ),
)

Bug: T106256
Change-Id: I6ee744b3e5306af0bed70811b558a543eed22840
autoload.php
composer.json
includes/debug/logger/MonologSpi.php
includes/debug/logger/monolog/AvroFormatter.php [new file with mode: 0644]
includes/debug/logger/monolog/BufferHandler.php [new file with mode: 0644]
includes/debug/logger/monolog/KafkaHandler.php [new file with mode: 0644]
includes/utils/AvroValidator.php [new file with mode: 0644]
tests/phpunit/includes/ConsecutiveParametersMatcher.php [new file with mode: 0644]
tests/phpunit/includes/debug/logger/monolog/AvroFormatterTest.php [new file with mode: 0644]
tests/phpunit/includes/debug/logger/monolog/KafkaHandlerTest.php [new file with mode: 0644]
tests/phpunit/includes/utils/AvroValidatorTest.php [new file with mode: 0644]