skynetservices/skynet-archive

MongoLogger as a skynet service

Closed this issue · 12 comments

To remove the hard MongoDB dependency from skynet, and to use skynet itself to keep the logger up, let's make the MongoLogger a wrapper around a skynet client that sends its requests through skynet RPC to a distributed Mongo logger service that does the actual interaction with MongoDB.

And make the service independent of any specific logging engine: make it an RPC service that does Log4J-type stuff, so we can set up file logging, MongoDB logging, etc.
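A minimal sketch of what that engine-agnostic RPC logger contract might look like in Go. All names here (LogEntry, LogService, FileLogger) are illustrative assumptions, not taken from the skynet codebase; the MongoLogger wrapper would marshal entries over skynet RPC to whatever engine implements the interface.

```go
package main

import "fmt"

// LogEntry is a hypothetical RPC request type; the field names are
// illustrative, not from skynet.
type LogEntry struct {
	Level   string
	Message string
}

// LogService is the RPC-facing contract the logger service would expose.
// Concrete engines (file, MongoDB, ...) implement it behind skynet RPC.
type LogService interface {
	Log(e LogEntry) error
}

// formatEntry renders an entry the way a plain file engine might.
func formatEntry(e LogEntry) string {
	return fmt.Sprintf("[%s] %s", e.Level, e.Message)
}

// FileLogger is one possible engine; it writes to stdout to keep the
// sketch self-contained.
type FileLogger struct{}

func (FileLogger) Log(e LogEntry) error {
	_, err := fmt.Println(formatEntry(e))
	return err
}

func main() {
	// Swapping engines is just swapping the LogService implementation;
	// the calling code never names a specific datastore.
	var svc LogService = FileLogger{}
	svc.Log(LogEntry{Level: "INFO", Message: "service started"})
}
```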

@bketelsen How much of the Log4J type stuff do you have in mind that log4go does or doesn't do? http://code.google.com/p/log4go/

I like the idea of log4go

Not a big fan of Mongo, and I plan to remove skynet's dependency on it with an ORM once I get up to speed on skynet development. It seems to me people should be able to pick their DB backend; skynet shouldn't vendor lock-in. For instance, for logging, an Elasticsearch cluster is much better than Mongo: the slicing-and-dicing capabilities and the ease of scaling make it a perfect fit for logging. It also removes the need for things like Graphite, because you can simply set up a JS frontend using tools like:

http://code.shutterstock.com/rickshaw/
http://square.github.com/crossfilter/

With Elasticsearch's capabilities these make for an amazing realtime stats, logs, and analytics platform... But to each their own, which is why Mongo needs to sit behind an ORM so people can choose MemSQL, Mongo, Elasticsearch, Redis, or whichever makes sense for the platform they are building.

There is also stathat.com as well... so really the Mongo decision seems to be pushing skynet away from being a platform that works for everyone.

To add to that, check out this talk: http://vimeo.com/44716955

@exsys Hmm, having one ORM for the many fundamentally different kinds of databases (e.g., Redis and MongoDB) seems hard to do without just covering their greatest common denominator, which is probably key/value pairs. I've definitely thought it'd be cool to have an ORM for all, say, document-oriented DBs (like Mongo and Couch), then another one for all columnar DBs (e.g., HBase and Hypertable), etc. And because Go is so new -- relatively speaking -- none of this has been done!

I've done mostly Python for the past 4 years and have enjoyed its infinite library support, though it's much harder to make one's mark for that reason.

Mongo is open source and the most common persistent NoSQL DB, so true lock-in shouldn't be an issue. ElasticSearch sounds interesting though; I'll look into it more.

The challenge is not so much where to store the data, but how we get logging data into a datastore such that it is meaningful for analytics. Current logging frameworks such as log4j and log4go only write text-formatted content to their various appenders or filters. To make use of text log data, one would have to write extremely complex regular expressions to try to make sense of all the logging data.

Much like the movement towards the Semantic Web, we need to make logging data machine readable and understandable. So not only do we want to log a text message, but we should be able to log any semantic information along with the log entry. For example, a tracking_number, userid, application name, etc.

If we look at current log APIs, in Ruby for example:

logger.info("Queried users table in #{duration} ms, with a result code of #{result}")

To add semantic information we could just pass in a second parameter (call it payload), which is a hash of information related to the call:

logger.info("Queried table", {
   :duration => duration,
   :result   => result,
   :table    => "users",
   :action   => "query" } )

If the logger then writes this data to Elasticsearch, MongoDB, or another document datastore, then we can easily perform analytics against the text message and its payload. Trying to do the same against text files with regular expressions or similar would be painful, to say the least.

All that we need then is for the logging API to add a second parameter to the existing logging calls that takes a payload object that can be serialized or stored in a meaningful way.

Existing file or other appenders/filters would just serialize the second parameter and append it to the message as readable text before writing to file, etc.

Since the binding between the payload and the final datastore is externalized from the application, the appender/filter can be changed by the end-user to any datastore of their choice. This also avoids the lowest common denominator across data stores as each appender can take full advantage of the underlying datastore without affecting the application.
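The message-plus-payload API and pluggable appender described above could be sketched in Go roughly as follows. The names (Payload, Appender, TextAppender, Logger.Info) are assumptions for illustration, not an existing skynet or log4go API; the TextAppender shows the "serialize the payload and append it to the message" behavior for file-style sinks.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Payload holds the semantic information attached to a log entry.
type Payload map[string]interface{}

// render serializes the payload and appends it to the message, which is
// what a plain-text appender would write to file.
func render(msg string, p Payload) (string, error) {
	b, err := json.Marshal(p) // Go sorts map keys, so output is stable
	if err != nil {
		return "", err
	}
	return msg + " " + string(b), nil
}

// Appender is the pluggable sink; swapping it chooses the datastore
// without touching application code.
type Appender interface {
	Append(msg string, p Payload) error
}

// TextAppender is the file-style appender described above.
type TextAppender struct{}

func (TextAppender) Append(msg string, p Payload) error {
	line, err := render(msg, p)
	if err != nil {
		return err
	}
	fmt.Println(line)
	return nil
}

// Logger adds the payload as a second parameter to the familiar calls.
type Logger struct{ out Appender }

func (l Logger) Info(msg string, p Payload) error {
	return l.out.Append(msg, p)
}

func main() {
	log := Logger{out: TextAppender{}}
	log.Info("Queried table", Payload{
		"duration": 32,
		"result":   0,
		"table":    "users",
		"action":   "query",
	})
}
```

A Mongo or Elasticsearch appender would store the Payload as a document instead of flattening it to text, making the fields directly queryable.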

For an example of the enhanced logging interface, see https://github.com/ClarityServices/semantic_logger

@Steve

https://github.com/astaxie/beedb

While not on the NoSQL side, it is an ORM for Go; it should be possible to make drivers for NoSQL as well.

Why not keep it simple and follow the same method stathat.com uses, then have an ORM layer so that any driver can be developed for storage, even a driver for stathat.com itself if one wishes to use it for analytics. If you watched the talk on Elasticsearch, it is really easy to slice and dice this type of data for advanced near-real-time analytics.
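The driver layer suggested here could follow the registration pattern of Go's database/sql package: backends register under a name, and the application opens one by name without hard-coding a datastore. A minimal sketch, with all names (Driver, Register, Open) assumed for illustration:

```go
package main

import "fmt"

// Driver is the minimal storage contract; real backends (Mongo,
// Elasticsearch, a stathat.com client, ...) would each implement it.
type Driver interface {
	Store(event map[string]interface{}) error
}

var drivers = map[string]Driver{}

// Register makes a driver available under a name, mirroring the
// registration pattern of Go's database/sql package.
func Register(name string, d Driver) { drivers[name] = d }

// Open selects a backend by name, so the application never hard-codes
// a specific datastore.
func Open(name string) (Driver, error) {
	d, ok := drivers[name]
	if !ok {
		return nil, fmt.Errorf("unknown driver %q", name)
	}
	return d, nil
}

// memDriver is an in-memory stand-in used for the example.
type memDriver struct{ events []map[string]interface{} }

func (m *memDriver) Store(e map[string]interface{}) error {
	m.events = append(m.events, e)
	return nil
}

func main() {
	Register("memory", &memDriver{})
	d, err := Open("memory")
	if err != nil {
		panic(err)
	}
	d.Store(map[string]interface{}{"metric": "latency_ms", "value": 12})
}
```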

@bketelsen How high a priority is this? Let me know if it makes sense for me to tackle it.

This isn't a huge priority right now; it was only meant as a way to abstract the MongoDB requirement out of skynet.

Definitely leaning more towards #194: putting the Mongo logger in its own package that can be included by people who want it.

Opinions? If the consensus is towards #194, then this ticket can be closed.

I agree... I asked a long time ago on this list for it to be pluggable.

Not a fan of Mongo... I would rather fire off to a message queue and store in the highly searchable Elasticsearch.