Simwood honeypot data available for download

Simon Woodhead

18th July 2012

As we’ve said before, we do not wish to profit from our customers’ misfortune, and we do everything we can to help them avoid having equipment compromised, or to contain the damage if it has been. In brief:

  • ThreatSTOP to stop traffic to/from known bad addresses passing your firewall
  • Fraud monitoring in the Simwood API – alert yourself when traffic breaks your business rules
  • Known ‘bad’ numbers blocked Simwood-side
  • Active monitoring of account behaviour and human intervention where possible

The last two have caught many attacks before they had the opportunity to cause much harm, but the fact remains that we cannot catch everything and can only do this on a best-endeavours basis, without warranty. To our knowledge, no customer employing one or both of the first two tools has been compromised. We strongly encourage all customers to make use of them and to talk to us if they need advice on policies to reduce the likelihood of compromise. For those who are proactive, today we add another ‘tool’ to the list.

Last year we announced a few security initiatives, notably our SIP Honeypot. Along with our Darknet it has been providing invaluable intelligence for us since. The raw data has also been made available to a number of security feeds. Effective today, it is being made available for all to download. Please see the terms of use at the bottom of this page.

It is crucial to understand what this data represents. It is an analysis of the contents of SIP packets targeting our honeypots; it is not an analysis of compromised customers. Similarly, it is not a certified blacklist of IP addresses (as we offer with ThreatSTOP) or a known list of bad numbers (as we filter internally). It is simply the traffic hitting our honeypots and, whilst summarised, it has not been filtered or processed in any way. Please therefore use it with consideration and at your own risk.

We offer five data sets:

  • IP – the apparent source IP address. Note that, whilst no legitimate traffic should hit the honeypot, this list will include the IP addresses of misconfigured equipment, and the addresses shown could well be spoofed. IP addresses are ordered by frequency, with a count and percentage figure.
  • Agent – the user agent identified in the SIP packet. User agents are ordered by frequency with a count and percentage figure.
  • Expanded – All of the above! This is the apparent source IP, method and user agent, ordered by frequency. It also includes the earliest and latest timestamp for matching packets within the period that the file represents.
  • Dialled – this includes the SIP method and the contents of the ‘to’ field. This can be filtered by method to identify common brute-force user names or the number patterns used in successful attacks. We include all methods seen, so an INVITE was not necessarily successfully authenticated unless you also see an ACK and, usually, a BYE. Records are ordered by frequency, with a count and percentage.
  • Full – all of the above! This shows the apparent source IP address, method, user agent and contents of the ‘to’ field, ordered by frequency and including an earliest and latest timestamp (within the period that the file represents).
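
As a rough illustration of filtering the Dialled data by method, here is a minimal Python sketch. The file layout is not described on this page, so the column order used here (method, ‘to’ field, count, percentage, comma-separated) is purely an assumption. The sample rows are invented placeholders, not real honeypot data, and the function name `to_values_by_method` is hypothetical.

```python
import csv
import io
from collections import Counter

# ASSUMPTION: the real file layout is not documented on this page. This sketch
# pretends each row is "method,to,count,percentage"; adjust to the real files.
# The rows below are invented placeholders, not real honeypot data.
SAMPLE = """\
REGISTER,sip:100@192.0.2.10,120,60.0
INVITE,sip:00441632960000@192.0.2.10,50,25.0
OPTIONS,sip:100@192.0.2.10,30,15.0
"""

def to_values_by_method(text, method):
    """Return a Counter of 'to' values for rows matching the given SIP method."""
    counts = Counter()
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        row_method, to_field, count = row[0], row[1], row[2]
        if row_method.strip().upper() != method.upper():
            continue
        try:
            counts[to_field.strip()] += int(count)
        except ValueError:
            continue  # skip rows whose count is not a number (e.g. a header)
    return counts

# Most frequent targets of REGISTER attempts, i.e. likely brute-forced user names.
print(to_values_by_method(SAMPLE, "REGISTER").most_common(5))
```

Filtering on REGISTER would surface guessed user names, while filtering on INVITE would surface dialled number patterns; as noted above, an INVITE on its own does not mean the call was authenticated.
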

Each data set is available for the four time periods below. If a file is missing, there is no data for that period:

  • 60m – the last hour. This is updated every minute and shows attacks almost real-time.
  • 24h – the last day. This is updated every 5 minutes.
  • 30d – the last month. Updated every hour.
  • 1y – the last year. Updated weekly.
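
If you keep local copies of the files, the update intervals listed above suggest how often each one is worth re-fetching. The following is only a sketch of that idea: the names `UPDATE_INTERVAL` and `is_stale` are hypothetical, and caching in this way is an assumption about how you might consume the data rather than anything the service requires.

```python
from datetime import datetime, timedelta

# Update intervals as stated in the list of time periods.
UPDATE_INTERVAL = {
    "60m": timedelta(minutes=1),
    "24h": timedelta(minutes=5),
    "30d": timedelta(hours=1),
    "1y": timedelta(weeks=1),
}

def is_stale(period, last_fetched, now):
    """True when a cached copy is at least one update interval old."""
    return now - last_fetched >= UPDATE_INTERVAL[period]

now = datetime(2012, 7, 18, 12, 0)
print(is_stale("24h", now - timedelta(minutes=3), now))  # False
print(is_stale("30d", now - timedelta(hours=2), now))    # True
```
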

The data sets are quite small and therefore, we hope, easy to work with. Keep in mind that they are summarised: the underlying data represented more than 30 million packets at the last count. How you use them is up to you, but please do let us know how you get on.

* Data is provided as is and strictly without warranty. Use it at your own risk, and please take time to understand what it means before doing so. Data may be freely used on the understanding that you will credit Simwood as the source where possible, linking to this page. It may not be redistributed in any form without permission. If we see that these conditions are not being followed, data access will be withdrawn without notice. Please play fair.
