
Gotta Catch 'Em All!

Innoculous: Enabling Epidemiology
of Computer Viruses in the
Developing World
- by Michael Paik

Project Report By:
Mansi Gupta (3013021) and Malavikka Sharma (3013020)
B.Sc. (H) Computer Science (VIth Semester)
Hansraj College
University of Delhi
2012
INTRODUCTION
What is a computer virus?
A virus is a small piece of software that piggybacks on real
programs in order to get executed. Once it is running, it
spreads by inserting copies of itself into other executable
code or documents.

The Problem

Among all the problems that a computer user in the
developing world faces today, the most pernicious is the
prevalence of computer viruses, which impose immediate
and unexpected costs.

However, it is difficult to pin down reliable figures for
the rates and types of infection, or for the scale of the
damage done, because published rates reflect only reports
from legally purchased copies of antivirus software running
on internet-connected machines. They do not reflect the
preponderance of software in the developing world, which is
illegally obtained, out of its license period, or operated
offline and therefore not updated.
The Global Infection Rate map by McAfee Labs.

Map panels: virus infections per million citizens from all viruses;
virus infections per million citizens from the top 10 viruses.
While data aggregated at this level is inconclusive, the
difference between North America and the developing
regions in this regard is remarkable: it strongly
suggests that the virus types present in the developing
world, while high in absolute infection rate, display a
different ecology from those in the developed world.

Anecdotal accounts by experts on the ground put infection
rates in the developing world at up to 80%, indicating a
well and truly endemic problem. This figure is corroborated
by recent surveys conducted by Bhattacharya et al. in
Bangalore, India.

The prevalence and impact of viruses are summarized in the
figure:

As is evident in the figure, 80% of centers experience
moderate to high prevalence of computer viruses, where
moderate indicates regular infections that cause
considerable problems and high corresponds to
continuous, highly detrimental infections.
The figure also summarizes the average expense on antivirus
software, grouped according to the severity of the virus
problem in a given location.
While the expenses are highly variable, it is evident
that investment in antivirus software is not sufficient to
spare a shop owner from these problems.
In addition, malware authors distribute their software in
infected versions of popular pirated software.
In 2009, the Internet security firm Intego discovered a new
Trojan horse in pirated copies of Apple's iWork '09
productivity software that could allow an attacker to take
control of the infected computer.

Research identifies USB sticks, in addition to Internet
websites, as sources of virus infections. It also cites SD
cards as a frequent infection vector.

The author of the research paper therefore presented and
described INNOCULOUS: a system consisting of a specially
crafted USB key, software and an incentivization strategy
aimed at disinfecting machines, creating revenue streams
for small businesses and individuals in the developing
world, and obtaining rich information about computer virus
infections. The paper appeared in the Proceedings of the
5th ACM Workshop on Networked Systems for Developing
Regions (NSDR) 2011, Washington DC, June 2011.
DESIGN
Inspiration
Innoculous was inspired by Disk Knight, security software
developed by a Bangladeshi student to protect computers
against malicious programs that spread via USB memory
sticks.
Its idea was simple: if a USB key is protected by Disk
Knight, the program prevents the launch of any other
process on the computer and displays a message prompting
the user to block or allow the starting process.

However, there was a problem in its implementation.
Once installed, Disk Knight copies itself onto every
unprotected USB key, making it protected.
Furthermore, when this newly protected USB key is
inserted into another system, Disk Knight runs and
installs itself onto that system without the user's consent.

This makes it a computer virus in itself.

Disk Knight has been classified as a PUA (potentially
unwanted application).
Environment
Innoculous was designed specifically to address infections
on the Windows platform, particularly the XP variant, because
the vast majority of virus infections in the wild are on this
platform due to its popularity. (The Windows family accounts
for over 80% of the total market.)

2012       Win7     Vista    Win2003  WinXP    Linux    Mac      Mobile
February   48.7%    4.5%     0.7%     30.0%    5.0%     9.1%     1.3%
January    47.1%    4.7%     0.7%     31.4%    4.9%     9.0%     1.3%

2011       Win7     Vista    Win2003  WinXP    Linux    Mac      Mobile
December   46.1%    5.0%     0.7%     32.6%    4.9%     8.5%     1.2%
November   45.5%    5.2%     0.7%     32.8%    5.1%     8.8%     1.0%
October    44.7%    5.5%     0.7%     33.4%    5.0%     8.9%     1.0%
September  42.2%    5.6%     0.8%     36.2%    5.1%     8.6%     0.9%
August     40.4%    5.9%     0.8%     38.0%    5.2%     8.2%     0.9%
July       39.1%    6.3%     0.9%     39.1%    5.3%     7.8%     1.0%
June       37.8%    6.7%     0.9%     39.7%    5.2%     8.1%     0.9%
May        36.5%    7.1%     0.9%     40.7%    5.1%     8.3%     0.8%
April      35.9%    7.6%     0.9%     40.9%    5.1%     8.3%     0.8%
March      34.1%    7.9%     0.9%     42.9%    5.1%     8.0%     0.7%
February   32.2%    8.3%     1.0%     44.2%    5.1%     8.1%     0.7%
January    31.1%    8.6%     1.0%     45.3%    5.0%     7.8%     0.7%

2010       Win7     Vista    Win2003  WinXP    W2000    Linux    Mac
December   29.1%    8.9%     1.1%     47.2%    0.2%     5.0%     7.3%
November   28.5%    9.5%     1.1%     47.0%    0.2%     5.0%     7.7%
October    26.8%    9.9%     1.1%     48.9%    0.3%     4.7%     7.6%
September  24.3%    10.0%    1.1%     51.7%    0.3%     4.6%     7.2%
August     22.3%    10.5%    1.3%     53.1%    0.4%     4.9%     6.7%
July       20.6%    10.9%    1.3%     54.6%    0.4%     4.8%     6.5%
June       19.8%    11.7%    1.3%     54.6%    0.4%     4.8%     6.8%
May        18.9%    12.4%    1.3%     55.3%    0.4%     4.5%     6.7%
April      16.7%    13.2%    1.3%     56.1%    0.5%     4.5%     7.1%
March      14.7%    13.7%    1.4%     57.8%    0.5%     4.5%     6.9%
February   13.0%    14.4%    1.4%     58.4%    0.6%     4.6%     7.1%
January    11.3%    15.4%    1.4%     59.4%    0.6%     4.6%     6.8%

2009       Win7     Vista    Win2003  WinXP    W2000    Linux    Mac
December   9.0%     16.0%    1.4%     61.6%    0.6%     4.5%     6.5%
November   6.7%     17.5%    1.4%     62.2%    0.7%     4.3%     6.7%
October    4.4%     18.6%    1.5%     63.3%    0.7%     4.2%     6.8%
September  3.2%     18.3%    1.5%     65.2%    0.8%     4.1%     6.5%
August     2.5%     18.1%    1.6%     66.2%    0.9%     4.2%     6.1%
July       1.9%     17.7%    1.7%     67.1%    1.0%     4.3%     6.0%
June       1.6%     18.3%    1.7%     66.9%    1.0%     4.2%     5.9%
May        1.1%     18.4%    1.7%     67.2%    1.1%     4.1%     6.1%
April      0.7%     17.9%    1.7%     68.0%    1.2%     4.0%     6.1%
March      0.5%     17.3%    1.7%     68.9%    1.3%     4.0%     5.9%
February   0.4%     17.2%    1.6%     69.0%    1.4%     4.0%     6.0%
January    0.2%     16.5%    1.6%     69.8%    1.6%     3.9%     5.8%
Data Logging
Computer data logging is the process of recording events
within a certain scope, using an automated computer program,
in order to provide an audit trail that can be used to
understand the activity of the system and to diagnose
problems.

As one stated goal of the Innoculous project was to acquire
rich data about virus infections, a writable medium was
necessary.

After considering several alternatives, a single
self-contained USB key was selected, with additional effort
made to ameliorate the infection problem that a writable
key introduces.
Infection Cleaning
Viruses target various types of transmission media or hosts:

+ Binary executable files.
+ Volume Boot Records of floppy disks and hard disk
partitions.
+ General-purpose script files.
+ Application-specific script files.
+ System-specific autorun script files.
+ Documents that contain macros.
+ Arbitrary computer files.
One of the primary goals of Innoculous was the cleaning of
virus infections, which necessitates an antivirus solution.

This led to two important design considerations:
+ Innoculous needed an antivirus engine with a
self-contained and preferably scriptable command-line
interface.
+ Measures must be taken to prevent viruses that might
exist on the machine being scanned from disabling the
antivirus engine or corrupting the logs.

Panda Antivirus fulfilled these
requirements and was thus selected. Moreover, it was
explicitly free to use for not-for-profit and research
purposes.
Infection Prevention
Windows variants from Windows 2000 through
Windows 7 recognize only the first partition that exists on
any USB memory key, and do not themselves have any
capability to create multiple partitions on such devices.
In observance of this fact,
Innoculous was installed on a second partition of the USB
stick, after a dummy 1 megabyte NTFS partition (the
minimum size), which is the partition presented to Windows.
In order to partially mitigate USB threats, this 1
megabyte partition has its entire capacity occupied by a
dummy file with a known hash, making the partition
tamper-evident and too small for many infections
with large or advanced payloads.

In addition, the small size of this partition will
discourage users from storing their own personal data on
these devices.
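
The tamper-evidence check on the dummy file can be illustrated with a minimal Python sketch. The report does not say which hash function Innoculous uses or how the check is run, so SHA-256 and the function names below are assumptions made only for illustration.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path` (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def partition_is_untampered(dummy_path, expected_hex):
    """True if the dummy file still matches the digest recorded when the key was made."""
    return file_sha256(dummy_path) == expected_hex
```

Any change to the dummy file, including a virus appending to it or replacing it, changes the digest, so a mismatch signals that the visible partition has been altered.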

IMPLEMENTATION
Custom Scripting
The Innoculous script is written in VBScript.

It has the following functionality:

1. Displays the key's hardware ID/serial number.
2. Presents the user with an option to replicate a child key.
3. Asks the user for the PIN, ZIP or other postal code of
their current location, if available.
4. Presents the user with an option to start a scan. If a scan
is started:
+ Records the serial numbers of all hard drives in the system.
+ Begins a scan using Panda Antivirus, storing verbose
logs.
+ Deactivates Autorun using the command-line registry
editor.
+ Records salient information about the machine, including
the Windows serial number, installed patches, etc.

5. If network connectivity is available:
+ Checks for updated virus definitions from a
preconfigured IP address.
+ Compresses and uploads any existing scan logs.
+ Records system time skew against an NTP server.
WinPE
Innoculous is implemented using the 32-bit version of
Windows PE 3.1, which provides a preinstallation
environment based on Windows 7 SP1.

Windows Preinstallation Environment (aka Windows PE or
WinPE) is a lightweight version of Windows XP, Windows
Server 2003, Windows Vista, Windows 7 or Windows Server
2008 R2 that is used for the deployment of workstations and
servers. It is intended as a 32-bit or 64-bit replacement for
MS-DOS during the installation phase of Windows, and can
be booted via PXE, CD-ROM, USB flash drive or hard disk.

USB Key Preparation
A USB key of at least 2GB in size is necessary for
Innoculous to run.
It was prepared on a Linux machine using the following
steps:

+ Using parted, an NTFS partition is created from
1023kB to 2MB. This creates a 1 megabyte (1024kB)
partition, which is the minimum size supported by any
modern filesystem supported by Windows.

+ Using mkntfs, the partition is formatted as NTFS.

+ parted is then used to create and format a FAT32
partition comprising the remainder of the device.

+ A Windows PE image is written onto the FAT32
partition using dd or partimage.

+ install-mbr or another Master Boot Record program
is used to install the MBR onto the USB key and point it to
the second partition, e.g.
install-mbr -p2 -e2 -v /dev/sdb

+ Using the output from fdisk -ul, the start boundary
is encoded into hexadecimal using, e.g., printf, and
inserted in little-endian format at position 0x1C of the
second partition. This can be done using any hex editor,
such as hexedit, on the device, e.g.
hexedit /dev/sdb2
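
The little-endian encoding in the last step can be sketched in Python instead of printf and a hex editor. This is an illustration only: it assumes the value at 0x1C is a 4-byte unsigned field (the width of the FAT "hidden sectors" field at that offset), and it patches an in-memory copy of the boot sector rather than a real device. The function names and the example sector number are hypothetical.

```python
import struct

START_FIELD_OFFSET = 0x1C  # position named in the text
START_FIELD_WIDTH = 4      # assumed width: 4-byte unsigned integer


def encode_start_sector(start_sector):
    """Pack the partition start sector as 4 little-endian bytes."""
    return struct.pack("<I", start_sector)


def patch_boot_sector(boot_sector, start_sector):
    """Return a copy of `boot_sector` with the start sector written at 0x1C."""
    patched = bytearray(boot_sector)
    end = START_FIELD_OFFSET + START_FIELD_WIDTH
    patched[START_FIELD_OFFSET:end] = encode_start_sector(start_sector)
    return bytes(patched)


# Example: sector 4096 is 0x00001000, so the least significant byte comes first.
print(encode_start_sector(4096).hex())  # 00100000
```

Little-endian order places the least significant byte first, which is why 0x00001000 is stored as the bytes 00 10 00 00.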
Deep Forensics
As the Innoculous installation, when run, has access to all
files resident on the host machine's drives, it is possible to
copy various files from the computer for forensic analysis
regarding behavior. Access to these data, properly
redacted, could prove to be a significant source of insight
into infection vectors and browsing habits in the
developing world.

This functionality, however, is not currently implemented
given the murky ethics surrounding the issue of privacy.
DISTRIBUTION
Replication
The script that serves as the core of Innoculous can also
replicate the entire system to another USB key. It does
this using the Windows AIK (Automated Installation Kit)
builder binaries, as well as Windows versions of
partitioning tools, to create a direct copy of itself.

In the process of replication:
+ The parent key records the serial number of the USB
device it is replicating itself to.
+ The replicated key is initialized with the hardware ID
of its parent, creating a bidirectional link that, as the
keys are replicated, creates a graph of keys.
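
The bidirectional parent and child links can be pictured with a small Python sketch. The class and method names are hypothetical and the report does not describe how the graph is actually stored; the sketch only shows that recording both directions lets a server walk from any key back to the first-tier key.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class KeyRecord:
    serial: str
    parent: Optional[str] = None                        # link child -> parent
    children: List[str] = field(default_factory=list)   # links parent -> children


class KeyGraph:
    def __init__(self):
        self.keys: Dict[str, KeyRecord] = {}

    def add_root(self, serial):
        """Register a first-tier key that has no parent."""
        self.keys[serial] = KeyRecord(serial)

    def replicate(self, parent_serial, child_serial):
        """Record a replication: both ends of the link are stored."""
        self.keys[parent_serial].children.append(child_serial)
        self.keys[child_serial] = KeyRecord(child_serial, parent=parent_serial)

    def lineage(self, serial):
        """Return [key, parent, grandparent, ...] up to the first-tier key."""
        chain = []
        current = serial
        while current is not None:
            chain.append(current)
            current = self.keys[current].parent
        return chain
```

For example, after `add_root("A")`, `replicate("A", "B")` and `replicate("B", "C")`, calling `lineage("C")` returns `["C", "B", "A"]`.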
Incentivization

The graph of keys is critical to the incentivization model,
which is essentially a bounty on new virus types
encountered and on the number of machines scanned.

In order to encourage users of the system to replicate their
keys and give them to others, a scheme analogous to the
one used by the MIT Red Balloon Challenge Team during
the DARPA Network Challenge was adopted.
+ The challenge was to be the first to submit the
locations of 10 moored, 8-foot, red weather balloons at 10
fixed locations in the continental United States.

In this model, bounties would be paid out starting with
the finder, with geometrically smaller proportions going to
the finder's parent, grandparent, etc.
Explicitly, 1/2 would be paid out to the finder, 1/4 to the
parent, 1/8 to the grandparent, etc.: for example,
Rs.4, Rs.2, and Rs.1.
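
The geometric split can be written out as a short Python sketch. The function name is hypothetical, and the total bounty of Rs.8 is only inferred from the amounts quoted above (Rs.4 being one half); the report does not state the bounty size.

```python
from fractions import Fraction


def split_bounty(bounty, lineage):
    """Give 1/2 to the finder, 1/4 to the parent, 1/8 to the grandparent, etc.

    `lineage` lists the finder first, then each ancestor in order.
    """
    shares = {}
    fraction = Fraction(1, 2)
    for key in lineage:
        shares[key] = bounty * fraction
        fraction /= 2  # each generation up receives half of the previous share
    return shares


shares = split_bounty(8, ["finder", "parent", "grandparent"])
print({key: int(value) for key, value in shares.items()})
# {'finder': 4, 'parent': 2, 'grandparent': 1}
```

Because the shares form the series 1/2 + 1/4 + 1/8 + ..., the total paid out never exceeds the bounty; with three generations, 1/8 of the bounty remains unallocated.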

While these amounts are small, given the large numbers of
infected machines and potentially multiple infections per
machine, this could represent a notable revenue stream in
the developing world.
Controls
In order to maintain reins on the system, several controls
may be optionally implemented on the keys:

+ A usage-based suicide gene that would wipe the key
once n scans had been completed and uploaded at
some internet-connected machine.

+ A time-based suicide gene that would wipe the key
at a given date, verified against a known NTP server
on some internet-connected machine.

+ Generational limits for how many generations from
the first tier of keys distributed may be replicated.

+ Invoked self-destruct that, when triggered by the
server, will cause the key to delete itself upon its next
check for virus signature updates.

+ Invoked disabling of self-replication, forcing any
given key to be a leaf node in the graph.
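
How these optional controls might combine can be sketched in Python. This is not the Innoculous implementation: the field names, the treatment of first-tier keys as generation 0, and the decision rules are all assumptions chosen to mirror the list above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class KeyControls:
    max_scans: Optional[int] = None       # usage-based suicide gene (n scans)
    expiry: Optional[date] = None         # time-based suicide gene
    max_generation: Optional[int] = None  # generational limit
    server_kill: bool = False             # invoked self-destruct
    replication_disabled: bool = False    # invoked disabling of replication


def should_wipe(controls, scans_uploaded, today):
    """Decide whether the key should wipe itself at this check-in."""
    if controls.server_kill:
        return True
    if controls.max_scans is not None and scans_uploaded >= controls.max_scans:
        return True
    if controls.expiry is not None and today >= controls.expiry:
        return True
    return False


def may_replicate(controls, generation):
    """Decide whether a key at `generation` (first tier = 0) may create a child."""
    if controls.replication_disabled:
        return False
    if controls.max_generation is not None and generation >= controls.max_generation:
        return False
    return True
```

In this sketch `today` is passed in by the caller, which matches the idea that the date is verified against an NTP server rather than taken from the possibly skewed clock of the host machine.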


ANALYSIS
The data Innoculous provides can be used for various
analyses:

+ GEOGRAPHIC SPREAD ANALYSIS
This could illustrate the spread levels and densities
of particular strains of viruses over a region.

Differences in geographic distribution of viruses before 2003

Before viruses turned into money-making machines, they were
mostly created in developed nations and regions such as
Europe, the USA, Canada, Japan, and Australia.
Today the biggest hotspots are Russia, Ukraine, Kazakhstan,
Romania, Moldova, China, and South America, especially
Brazil, which is the biggest source of banking trojans that
steal money during online banking.

Differences in geographic distribution of viruses after 2003

By 2009, there were even more advanced viruses, and the
number of infected machines around the world is now in the
millions.
+ STRAIN ANALYSIS
Based on the birthday of each strain of virus, worm, or other
malware, it is possible to determine certain data regarding the
age, spread rate and infection vector of observed viruses.

+ REINFECTION
As some machines will likely be scanned more than once given
a sufficiently large network of Innoculous keys, data will
emerge regarding subsequent re-infection of machines that
have been cleaned before.
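
A minimal Python sketch of such a re-infection analysis follows. It assumes each uploaded log can be reduced to a (hard drive serial, scan date, infections found) record and that every scan cleans what it finds; this record layout and the function name are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def find_reinfections(scans):
    """Return the drive serials that showed infections on any scan after their first.

    `scans` is an iterable of (drive_serial, iso_date, infections_found) tuples.
    """
    by_machine = defaultdict(list)
    for serial, when, infections in scans:
        by_machine[serial].append((when, infections))

    reinfected = []
    for serial, history in by_machine.items():
        history.sort()  # ISO dates sort chronologically as strings
        # The first scan cleaned the machine, so later findings are re-infections.
        if any(infections > 0 for _, infections in history[1:]):
            reinfected.append(serial)
    return sorted(reinfected)


scans = [
    ("HD1", "2011-01-01", 3),
    ("HD1", "2011-03-01", 1),
    ("HD2", "2011-01-05", 2),
    ("HD2", "2011-02-01", 0),
    ("HD3", "2011-01-09", 0),
]
print(find_reinfections(scans))  # ['HD1']
```

Machines are matched by hard drive serial number because the scan already records it, which lets scans made with different keys be attributed to the same machine.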

+ PIRACY ANALYSIS
This would determine what proportion of Windows installations
are genuine and which may have come pre-infected with viruses.
Nearly ten times as many Windows XP SP3 systems get infected as
Windows 7 SP1 64-bit systems. Even Windows Vista with its latest
service pack installed reports only half the infection rate that
Windows XP reports.
CONCLUSION
The use of the Innoculous system, if widespread, will
provide the research community with a detailed corpus of
data regarding virus infection rates and types at low cost
while simultaneously providing revenue streams for small
business owners and individuals in the developing world
and raising awareness of the problems presented by virus
infection.

As a bonus, it will also provide a social network graph of
people in the region(s) in question who are likely to be
considered local computer power users, information that
could help establish a valuable social network for deploying
future projects.
