
Generation of Large Archives for Projects on the World's Leading Project Collaboration Platform and Online Project Management System

Introduction
Aconex is the world's leading project collaboration and online project management system. The main goal of the engagement was to provide clients with their own copy of project data as an archive. As part of the products handled by Imaginea, this archiving process was built and several peripheral products were developed. These products included Project Archive for building and shipping archives to clients, Local Copy for clients to have a self-updating archive of a live project, a generic License Generator for licensing these products (or any other), and Blinky for caching downloads from the Aconex online system locally within the organization.
Imaginea, with its expertise in product development, enabled Aconex to build these products in a short span of less than a year, with a small team of around 8 people. The products were delivered with the high quality, configurability and performance criteria specified by Aconex.

Challenges

The biggest challenge was that each archive (project) was based on a different set of fields specified by a different schema. This made the use of standard POJOs and ORM difficult, so we used JDBC-based generation of the database and other queries.
It was a fixed-time, fixed-cost project to start with, and a large part of the Project Archive functionality was built in the first 3 months of the exercise. A good break-up of the work and parallel execution helped a lot here.
A large amount of data had to be archived. The archive generator was expected to build archives containing 1 million documents and 1 million mails, and the actual data of an archive could run to multiple terabytes, so network bandwidth was put to the test.
Performance of the archive viewer had to be very high: despite the large amount of data in the archive, any page, including searches, had to respond in under a second. Technologies like Lucene were used to achieve this.
This was not just a regular web application; it required an installer-based setup, which meant extra work at each stage. IzPack-based installers were developed for most of the products.
Keeping the price of the product low was also one of the challenges. The product stack was chosen to keep costs to a minimum, and open source products were used wherever possible.

Meeting expectations and proactively designing a UI for the client that is in sync with their current site.
As the client received an installer, upgrading these applications was particularly challenging because no manual intervention is possible during an upgrade. A robust upgrade system, with recovery on failure, was provided. We needed to track the database and relevant application changes from version to version for this (a minimal sketch of the idea follows).
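The actual upgrade code is not part of this case study; the following is a minimal sketch, assuming a hypothetical migration registry and an app_version table, of how version-to-version database changes can be tracked and applied in order with plain JDBC so that an installer upgrade needs no manual intervention.

```java
import java.sql.Connection;
import java.sql.Statement;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: apply schema changes in version order and record
// progress so a failed upgrade can be resumed without manual intervention.
public class UpgradeRunner {

    // Registered migrations, keyed by the version they upgrade TO (sample SQL is illustrative).
    private final Map<Integer, List<String>> migrations = new LinkedHashMap<>();

    public UpgradeRunner() {
        migrations.put(2, Arrays.asList("ALTER TABLE documents ADD COLUMN revision VARCHAR(32)"));
        migrations.put(3, Arrays.asList("CREATE INDEX idx_mail_date ON mails (sent_date)"));
    }

    /** Applies every migration between the installed version and the target version. */
    public void upgrade(Connection conn, int installedVersion, int targetVersion) throws Exception {
        for (int v = installedVersion + 1; v <= targetVersion; v++) {
            List<String> steps = migrations.getOrDefault(v, Collections.emptyList());
            try (Statement stmt = conn.createStatement()) {
                for (String sql : steps) {
                    stmt.execute(sql);                                        // apply the schema change
                }
                stmt.executeUpdate("UPDATE app_version SET version = " + v);  // record progress for recovery
            }
        }
    }
}
```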

Products
1. Project Archive
This was the first product. Its goal was to deliver an archive consisting of documents and mails to the client as an installable product. The product had two parts.
Project Archive Generator
The generator was hosted in the Aconex network and is used to generate the archives.

Uses REST APIs exposed by Aconex to collect data for the archive. This includes document files, metadata and mails.
Designed as a multithreaded processor to optimize the archiving process. Multiple archives can be built concurrently, with each archive processing more than one mail/document at a time.
The output of the generator is an archive consisting of mails and documents, with the metadata stored as a database dump and a Lucene index for faster searches in the viewer.
Each archive was based on a different project schema (different fields, and a different number of fields). For this reason the DB tables were dynamic and were generated separately for each project. This was one big challenge, as ORM could not be used straight away (a minimal sketch of the approach follows this list).
Usability operations such as pause, resume and cancel were provided over and above the features discussed in the fixed-bid contract.
Progress of the archiving process was shown per step (documents, mails, attachments) as number X of Y.
Detailed error handling was done in the generator and multiple configurations were provided for it. Errors were displayed to the user in detail and resolutions/workarounds were suggested.
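The exact query-generation code is not shown in the case study; the sketch below, assuming a hypothetical FieldDef list read from the project schema, illustrates how a per-project table can be generated with plain JDBC instead of a fixed ORM mapping.

```java
import java.sql.Connection;
import java.sql.Statement;
import java.util.List;

// Hypothetical illustration: build the CREATE TABLE statement from the
// project-specific schema instead of mapping a fixed POJO through an ORM.
public class DynamicTableBuilder {

    /** A field as described by the project schema (name and SQL type are assumptions). */
    public static class FieldDef {
        final String name;
        final String sqlType;
        FieldDef(String name, String sqlType) { this.name = name; this.sqlType = sqlType; }
    }

    public void createDocumentTable(Connection conn, String projectId, List<FieldDef> fields) throws Exception {
        StringBuilder ddl = new StringBuilder("CREATE TABLE documents_" + projectId + " (id BIGINT PRIMARY KEY");
        for (FieldDef f : fields) {
            ddl.append(", ").append(f.name).append(" ").append(f.sqlType);
        }
        ddl.append(")");
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(ddl.toString());   // one table per project, shaped by its own schema
        }
    }
}
```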

Project Archive Viewer
The viewer is a web application to browse and search documents and mails.

Dynamic search UI based on the searchable fields in the schema.
Search can be on a single value or on multiple values, and is based on Lucene indexes to make it fast (see the sketch after this list).
Clients can install more than one archive on the same installation.
Data searched in the UI can be exported to an Excel file.
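The indexing code itself is not in the case study; the following is a minimal sketch, written against a recent Lucene release and with assumed field names (docNo, title), of indexing document metadata during archive generation and searching it from the viewer.

```java
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MetadataIndex {

    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(Paths.get("archive-index"));

        // Index metadata while the archive is being generated.
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            doc.add(new StringField("docNo", "DWG-0001", Field.Store.YES));                // exact-match field
            doc.add(new TextField("title", "foundation layout drawing", Field.Store.YES)); // full-text field
            writer.addDocument(doc);
        }

        // Search from the viewer: index lookups keep responses fast even on large archives.
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            TopDocs hits = searcher.search(new TermQuery(new Term("title", "foundation")), 10);
            System.out.println("matches: " + hits.totalHits);
        }
    }
}
```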

2. Local Copy
Local Copy was an extension of the Project Archive: a local copy of the client's data could be maintained on the client's premises. As with Project Archive, data was downloaded through the Aconex REST APIs and a locally archived copy was created. The extra requirement here was a regular, scheduled update of data from the online system.

Scheduling options for daily/weekly/monthly updates (a minimal scheduling sketch follows this list).
A different look and feel.
As the client would be accessing the online APIs directly, Local Copy was made a licensed product to restrict access, and any update required the license to be verified. The license was verified for the user and the project for which it was issued.
It had most of the features of Project Archive; both the generator and the viewer were present in a Local Copy installation.
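The case study names Quartz for scheduling in Local Copy; the sketch below, assuming a hypothetical ArchiveUpdateJob and a weekly cron expression, shows how such a scheduled update could be wired with the Quartz 2.x API.

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

// Hypothetical job: pull new documents and mails from the online system.
public class ArchiveUpdateJob implements Job {

    @Override
    public void execute(JobExecutionContext context) {
        System.out.println("Updating local archive from the Aconex REST APIs...");
        // download changed documents/mails, update the DB dump and the Lucene index
    }

    public static void main(String[] args) throws Exception {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
        scheduler.start();

        JobDetail job = JobBuilder.newJob(ArchiveUpdateJob.class)
                .withIdentity("archive-update", "local-copy")
                .build();

        // Weekly schedule: every Sunday at 02:00; daily/monthly variants only change the cron expression.
        Trigger trigger = TriggerBuilder.newTrigger()
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0 2 ? * SUN"))
                .build();

        scheduler.scheduleJob(job, trigger);
    }
}
```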

3. License Generator
The License Generator was a generic license generator, built primarily for Local Copy licenses but designed and developed in a very generic manner: any set of properties and values can be specified for verification. Generic client-side code was provided to validate the licenses.

A generic license generator that creates licenses based on key-value pairs of the properties required for verification. It can be used as-is by any Java-based product to generate licenses from product-specific properties (a minimal sketch of the idea follows this list).
We used the open source product TrueLicense, which provides JCE-based encrypted licenses. We evaluated various other licensing products before settling on it.
This product had provisions to add products (licensed products) and versions, and to create licenses for those different versions.
A generic verifier was also provided for licensed products to include and verify licenses.
The product was designed keeping possible future enhancements in mind, such as usage by a .NET product and IP-based license verification; these pieces can be added later.
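The actual product is built on TrueLicense; the sketch below does not use the TrueLicense API, but illustrates the same underlying idea with plain JCE: a license is a set of key-value properties signed by the issuer, and the client verifies both the signature and the expected property values. The property names (user, project) are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Properties;

// Illustrative only: a signed key-value license, not the TrueLicense API.
public class KeyValueLicenseDemo {

    /** Deterministic serialization of the license properties (sorted key=value lines). */
    static byte[] canonicalBytes(Properties license) {
        StringBuilder sb = new StringBuilder();
        license.stringPropertyNames().stream().sorted()
               .forEach(k -> sb.append(k).append('=').append(license.getProperty(k)).append('\n'));
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Issuer side: sign the serialized properties with the private key. */
    static byte[] sign(Properties license, PrivateKey issuerKey) throws Exception {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(issuerKey);
        sig.update(canonicalBytes(license));
        return sig.sign();
    }

    /** Client side: the signature must match and the expected properties must be present. */
    static boolean verify(Properties license, byte[] signature, PublicKey issuerKey,
                          String expectedUser, String expectedProject) throws Exception {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(issuerKey);
        sig.update(canonicalBytes(license));
        return sig.verify(signature)
                && expectedUser.equals(license.getProperty("user"))        // assumed property name
                && expectedProject.equals(license.getProperty("project")); // assumed property name
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator keyGen = KeyPairGenerator.getInstance("RSA");
        keyGen.initialize(2048);
        KeyPair issuer = keyGen.generateKeyPair();

        Properties license = new Properties();
        license.setProperty("user", "alice@example.com");
        license.setProperty("project", "PRJ-1234");

        byte[] signature = sign(license, issuer.getPrivate());
        System.out.println(verify(license, signature, issuer.getPublic(), "alice@example.com", "PRJ-1234"));
    }
}
```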

4. Blinky
Blinky was a product created to save bandwidth for clients. It was a local file cache in the client's organization, which cached documents downloaded from the Aconex online system and served them locally if they were requested from the same organization again. It was developed to an alpha product in a single two-week sprint. With one of the senior Aconex developers visiting us, we were able to work very efficiently, making the changes to the online system and to Blinky together. We kept the whole setup ready even before the Aconex developer visited us and were ready to go once he arrived.

A local cache of documents in the client's organization.
An Ajax call from the Aconex online application to Blinky is used to get a document if it is in the cache; otherwise, Blinky fetches the document from the Aconex online application.
Cache statistics were displayed to showcase the benefits of caching.
Eviction from the cache was implemented to make sure the latest and most relevant documents stay in the cache (a minimal eviction sketch follows this list).
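The case study does not state which eviction policy Blinky used; the sketch below assumes a simple LRU policy and shows how a bounded document cache with eviction can be built on a LinkedHashMap in access order, with a hypothetical origin fetch on a cache miss.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Assumed LRU policy: the least recently requested document is evicted first.
public class DocumentCache extends LinkedHashMap<String, byte[]> {

    private final int maxEntries;

    public DocumentCache(int maxEntries) {
        super(16, 0.75f, true);       // accessOrder = true: iteration order runs from least to most recently used
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
        return size() > maxEntries;   // evict when the cache grows beyond its bound
    }

    /** Returns the cached document, or fetches it from the online system on a miss. */
    public synchronized byte[] getDocument(String documentId) {
        byte[] content = get(documentId);
        if (content == null) {
            content = fetchFromOnlineSystem(documentId);  // hypothetical origin fetch
            put(documentId, content);
        }
        return content;
    }

    private byte[] fetchFromOnlineSystem(String documentId) {
        // In the real product this would be an HTTP call to the Aconex online application.
        return ("document " + documentId).getBytes();
    }
}
```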

Process
This account is a very good example of the adoption of the SCRUM methodology. As Aconex was following SCRUM in their organization, Imaginea started with some lightweight agile practices in the initial fixed-bid part of the project. We tried to finish tasks per sprint (2 weeks) and demonstrated per milestone (1 month).
Later, with more involvement from the client, we also took a deeper dive into the SCRUM process. It certainly needed a change of mindset to accommodate and practice it. The following are the things we learnt as part of this (please refer to SCRUM documentation for the standard practices):

Breaking a story up into small and, where possible, simple stories (size 5 for us, representing work doable within 2 days) made estimation and execution much better. Bigger sizes (8 or more, representing more than a couple of days) resulted in too many tasks queuing up in the QA column and made the end of the sprint more hectic.
We learnt how to break a story into deliverable stories, instead of pure developer or pure QA stories. This made the deliverables of the stories more meaningful and easier to track.
We used a SCRUM board for better visibility and tracking, and sent pictures of the daily SCRUM board (along with stand-up meeting details) to Aconex to keep them involved.
We initially faced trouble with QA-dev ping-pong. This was overcome by making the kick-off of each story extensive: all parties were involved in the kick-off meeting and made sure they understood the deliverables of the story. Test plans were built on the details discussed in the kick-off.
For the same problem, we also used a pre-check-in demonstration to QA so that the developer and QA stayed in sync and a check-in did not result in something mismatching QA's expectations.
If the QA column was overloaded, developers participated in the QA activities to ease the load.
With more than 4 products being worked on simultaneously, we needed to keep a dedicated regression day towards the end of the sprint. Estimations were built with that in mind.

Performance
With the amount of data in these products, performance was very important. All pages were expected to render with a sub-second response. Other performance-related points:

For sub-second search responses, we used Lucene to index metadata and provide search results. The index was created during the archiving process.
Page performance was observed with the Firefox Firebug plugin for various pages.
Later, Funkload was used by QA for load testing.
Pagination of result was implemented for faster user experience.
The archiving process was made multi-threaded, and configuration options were provided to tune the thread pool for fast archiving and updates (see the sketch after this list).
We used a local Jetty server instead of Aconex online to test the archiving of 1 million documents, because access to the Aconex online system was very slow over the internet.
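The archiver code itself is not shown in the case study; the sketch below, assuming a hypothetical archiveDocument step and a pool size read from configuration, shows how a multithreaded archiving pass can be structured with a standard ExecutorService.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ArchiveProcessor {

    private final ExecutorService pool;

    /** The pool size comes from configuration so it can be tuned per installation. */
    public ArchiveProcessor(int configuredThreads) {
        this.pool = Executors.newFixedThreadPool(configuredThreads);
    }

    public void archiveAll(List<String> documentIds) throws InterruptedException {
        for (String id : documentIds) {
            pool.submit(() -> archiveDocument(id));   // documents are archived concurrently
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    private void archiveDocument(String id) {
        // Hypothetical step: download the file and metadata, write to the archive, index it.
        System.out.println("archived " + id);
    }
}
```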

Delivery
Project Archive was the first product to be developed, and it was delivered as a fixed-bid project. The whole product was delivered in exactly 3 months. After this, the first version was also developed in the fixed-bid mode. During this time, we gave demonstrations at every milestone and implemented feedback from Aconex.
Later, the team worked on a time-and-materials basis. Local Copy, Blinky and License Generator were delivered under this model. With the trust of Aconex, this model was initiated and proved more cost effective for Aconex.
An installer-based setup was the final deliverable for the Project Archive, Local Copy and Blinky products. License Generator was a script-based setup, as it was not an end-client product.
During the time-and-materials exercise, we followed the Agile SCRUM methodology. Every sprint (2 weeks) we gave demonstrations to the client, and a build incorporating the suggested feedback was delivered before the sprint end.
Source code was also delivered with each product.

Technologies
Tomcat, Spring, MySQL, jQuery, JiBX, Apache HttpClient, Apache POI
JDBC and Hibernate: persistence APIs for all products except License Generator and Blinky
Velocity: template-based EML generation
Quartz: scheduling in Local Copy
JFreeChart: statistics in Blinky
Cobertura: code coverage
Sonar: code quality monitoring
Hudson: continuous integration
Funkload: load testing of Blinky
BrightTest: automated testing

Quality
Quality of delivery was a big challenge, as a small QA team needed to take care of 4 products at a time. The following practices were established to overcome this challenge.
Detailed Kick Offs
Before developers started on any story, there was a detailed kick-off to avoid too much ping-pong between developers and QA. This involved all the parties concerned, i.e. the developer, QA, and the client if required. All expectations from the story were clearly communicated amongst these parties.
Test Plans
Based on the kick-off discussion and QA expertise, test plans were created and given a quick review by developers and, where needed, the client. They were uploaded to JIRA, tracked from there, and followed during the testing cycle. This also avoided dependency on any particular resource.
Unit testing
Unit tests were built by the developers wherever possible, and estimates included time for unit tests. Some developers also practised test-driven development, writing the tests first and then filling in the code to make them pass.
Developer demonstrations to QA before testing
Before check-in, developers demonstrated the functionality to the QA person to make sure there were no apparent problems or expectation mismatches. This way, ping-pong (and hence time and communication) was saved between developer and QA.
Integration Tests
Integration tests were built to test the integration of the products. Some of the tests were complicated, as the generator and other products involve a lot of multithreaded processing. Configurable tests were written to observe the behaviour of the product with different numbers of threads. To avoid heavy network load and delay, a Jetty server was used as a mock server (a minimal sketch follows).
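The mock server code itself is not in the case study; the following is a minimal sketch, using the embedded Jetty 9 API and a hypothetical canned response, of how a local server can stand in for the Aconex online system during integration tests.

```java
import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.AbstractHandler;

public class MockAconexServer {

    public static Server start(int port) throws Exception {
        Server server = new Server(port);
        server.setHandler(new AbstractHandler() {
            @Override
            public void handle(String target, Request baseRequest,
                               HttpServletRequest request, HttpServletResponse response) throws IOException {
                // Serve a canned payload instead of calling the real online system.
                response.setContentType("application/xml");
                response.getWriter().write("<documents><document id=\"DWG-0001\"/></documents>");
                baseRequest.setHandled(true);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        Server server = start(8080);   // integration tests point the generator at http://localhost:8080
        server.join();
    }
}
```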
Sprint Demo to Client
The sprint demo to the client ensured that expectations were matching. In case of changes, a build with the fixes was usually provided in the same sprint.
Automated Unit tests with Continuous Integration
In a running machine, a single problem can cascade into many and cause bigger issues; the same is true of complex products. For this reason, continuous monitoring of quality is very important.
Hudson was used to create nightly builds that execute at midnight every day. It was also configured to poll source control every 30 minutes for code check-ins by developers; if it found any, it built all the products. In case of any build failure, including the failure of any test case, it sent feedback to the respective stakeholders so the issue could be fixed immediately.
Cobertura was also integrated to observe code coverage. It provides a detailed view of what percentage/part of the code is covered and what needs further automated tests.
Code Quality Monitoring
SONAR was integrated to monitor code quality. It produced reports on code and design standards, which were made centrally available. These reports were reviewed every day, and the changes suggested by SONAR that were relevant to the product were taken up, thus maintaining product quality continuously.
Automated QA Tests
QA automated a lot of functionality using the in-house automation tool BrightTest, which is built over Selenium. This was very helpful in completing test cycles faster.
Load Testing
Funkload was used to test Blinky with variable load. Predefined scripts were run for a configurable number of concurrent users.
Regression Testing
As the functionality of the products grew, it was difficult to maintain them with the same resources. For this reason, we kept a day towards the end of the sprint for regression activities. This was specifically done when a product version was due for release.

Imaginea Edge
Imaginea carries expertise in both core product development and web application building, and both were central to the Aconex exercise. This expertise helped Imaginea deliver the products in an optimal and timely way, saving cost for the client.
The products were not only delivered functionally, but also went through a product development cycle with an Agile process. The maintainability of the source and resources was instrumental in achieving this.
Other core technology expertise of Imaginea also came in handy, including Spring, JDBC, Hibernate, development based on dynamic schemas, and MySQL.
Proactive Progress
Imaginea engineers proactively explored the REST APIs and kept themselves ready for the assignments.
Imaginea engineers explored the Aconex application's functionality and made sure we understood the product.
Imaginea engineers provided various inputs to Aconex with UI mockups wherever needed.
A lot of functionality was built to make the applications more usable; this was not directly part of the contract/communication, e.g. pause/resume/activation of prebuilt jobs.
The Imaginea team adopted the SCRUM process to stay as close to Aconex technically as possible. This made communication very fluent.
The Imaginea team implemented continuous integration using Hudson, observed code quality using SONAR and measured code coverage with Cobertura, all during busy sprints.
The Imaginea team kept the setup ready for Blinky so that the most could be made of the Aconex engineer's brief visit.
To get around the problems of network latency and Aconex test site availability, we set up a local Jetty server for testing.
Product Expertise
The product background of Imaginea helped the Aconex product line to be delivered with a good degree of maintainability. Multiple cycles of changes and refactoring were done to keep code and resources at a minimum of technical debt.

The archiving process was optimized with the multithreaded processor and its configurability.
Experience of working on dynamic-schema-based products helped Imaginea develop Project Archive more efficiently.
Multiple products were handled at the same time by the same team.
Expertise in logging helped Imaginea maintain and resolve bugs in the archiving products very effectively; client problems were easily identifiable.
Expertise in Web 2.0 technologies helped Imaginea quickly design the product screens.

Large numbers of files were optimally encrypted/decrypted for secure archives.

Summary
Imaginea strives to provide a lot of value in terms of efficiency of work, process and client interactions, so that products are delivered seamlessly to the client's expectations. This account, with 4 products delivered in less than a year, is an example. The teams are dedicated to learning and optimizing. The quality of the products is inherited from the product background of Imaginea. Beyond features, performance, scalability and configurability, Imaginea engineers also strive to provide quality, modularity, testability and automation to work optimally. We are also more than ready to integrate with client processes to make the client comfortable and to maintain the best communication. In addition, Imaginea engineers proactively analyse requirements, create POCs and try out client resources to make sure that the requirements are clear to all parties.
