Ideas for Smalltalk GSoC 2012 projects

Here we are collecting ideas for projects that would be good to have in this year's GSoC.

 

Core

 

 

Minimal virtual machine

Level: intermediate

Possible mentor: Craig Latta

Possible second mentor: Eliot Miranda (I'm volunteering him :)

Description

Create a minimal Squeak virtual machine, to accompany the minimal object memory provided
by the Spoon project.

Technical Details

Benefits to the Student

The student will learn the organization of the Squeak virtual machine.

Benefits to the Community

The community will benefit from an improved virtual machine organization, making it
easier to learn and easier for newcomers to contribute. 


Object memory modularization

Level: intermediate

Possible mentor: Craig Latta

Possible second mentor: (volunteers sought)

Description

Modularize the Squeak object memory into Naiad modules (see http://netjam.org/spoon).

Technical Details

Organize the classes and methods of Squeak into composable modules using Naiad, Spoon's
history system. Develop tools to aid in this process.

Benefits to the Student

The student will learn how to create deployable Squeak applications, by helping
to invent the process.

Benefits to the Community

The community will benefit from improved techniques for collaboration and deployment. 

 

Bootstrapping the core

Level: Advanced

Possible mentor: S. Ducasse

Possible second mentor: I Stasenko

Description

The goal of the project is to support the bootstrapping of an image based on a list of operations.

Technical Details

This work is a follow-up on the effort we started around Hazelnut. We will continue working on the system tracer and revisit the decisions we took in Hazelnut.

Benefits to the Student

Learn about the internals of the system and about bootstrapping techniques.

Benefits to the Community

Have a new kernel that builds from sources systematically 

 

Get Etoys image to run on CogVM

Level:

Possible mentor: Karl Ramberg

Possible second mentor:

Description

Image-side changes to make the Etoys image run on the CogVM. Some plugin support work on Cog may also be needed.

Benefit for the Student

Work with a high performance JIT and implement support for this in the Etoys system.

Benefit for the Community

Faster execution would benefit Etoys, which is often run on smaller systems, where a few more cycles would make the GUI noticeably more responsive.

 

ARM jitter for Squeak VM

Level: Advanced

Possible mentor: Eliot Miranda

Possible second mentor:  Igor Stasenko

Description

The Squeak VM is the dynamic virtual machine used for many open-source software projects such as Scratch [1], eToys [2], Pharo [3], the Newspeak language [4], the innovative web framework Seaside [5] and many
others. CogVM [6] is a development of the Squeak VM which adds a powerful Intel x86 JIT compiler [7]. The CogVM JIT has significantly improved the performance of the open-source Smalltalk projects which have adapted
to use it. Increasingly, low-cost, highly capable ARM hardware such as the Raspberry Pi [8] and the Beagle Board [9] has become widely available. In addition, the new version of the One Laptop per Child machine is based on the ARM
platform [10]. The Squeak VM compiles for ARM platforms, but there is currently no JIT for ARM, which significantly reduces the performance of popular software on ARM. The goal of this project is to
add a simple ARM JIT capability to the CogVM.

[1] http://scratch.mit.edu/
[2] http://www.squeakland.org/about/intro/
[3] http://www.pharo-project.org
[4] http://newspeaklanguage.org/
[5] http://seaside.st/
[6] http://gitorious.org/cogvm
[7] http://en.wikipedia.org/wiki/Just-in-time_compilation
[8] http://www.raspberrypi.org/
[9] http://beagleboard.org/bone
[10] http://one.laptop.org/about/xo-3

Technical Details

The work requires an interest in virtual machine optimisation, some knowledge of Intel x86 and ARM assembler, and knowledge of C and dynamic languages. The Squeak and Cog VMs are written in a simplified subset of Smalltalk known as Slang [11], which is translated to C and forms the basis of the VM. As the Squeak VM is a Smalltalk program, it is developed in Smalltalk, and the Cog JIT is no exception. The VM, including the JIT, is written in Smalltalk and run in the context of the Smalltalk IDE, but the JIT still generates machine code that must be evaluated within the Smalltalk environment. On x86 this is done by interfacing to an x86 simulator library derived from the Bochs x86/x86-64 PC simulator, written in C++. Implementing the ARM port should be no different. The first task will be to choose and interface to a suitable ARM simulator/emulator. Once
this is working, the ARM code generator can be incrementally developed within Smalltalk. Finally, once the simulator is fully functional, one can get down and dirty with an actual physical ARM machine, such as the
Raspberry Pi or Beagle Board.

[11] http://wiki.squeak.org/squeak/2267

Benefits to the Student

The student will gain an in-depth knowledge of virtual machine optimisation, working in a productive innovative environment - it's great fun to be able to implement a JIT in a safe high-level dynamic language, instead of the traditional route of developing in C/C++ and debugging in GDB. The student will have the satisfaction of seeing performance gains for a range of high-profile projects which use the Squeak VM on ARM.

Benefits to the Community

The Smalltalk community will gain an initial implementation of an ARM JIT which can then be further developed alongside the x86 dynamic translation work. An ARM JIT for the CogVM will improve the performance of many notable open-source projects on low-cost ARM hardware, bringing innovative software and development environments to a wider community.

 

Hazelnut

Level: Intermediate

Possible Mentor: Stephane Ducasse

Possible second mentor:

Description

Hazelnut is part of the Seed project, whose goal is to bootstrap the system.

Technical Details

Ensure the validity of the created kernel and improve the serialization mechanism. The goal is to clarify and secure a creation mechanism which can be applied to a dynamically generated kernel as well as to a statically described kernel.

Benefits to the Student

Deep understanding of the system layouts, the kernel definition, the meta-model and the reflective capabilities of the system. The student will also learn the object format and the basics of VM use.

Benefits to the Community

The community will gain a way to bootstrap a fresh kernel from an existing image or from a kernel description. It could also be used to generate a minimal kernel for embedded systems.

 

Improving Squeak/Pharo support for running on >1 cores on the RoarVM

Level: Intermediate

Possible mentor: Stefan Marr

Possible second mentor: ?

Description

Currently, Squeak and Pharo make strong assumptions about the VM they run on. This includes the assumption that Smalltalk processes are not executed in parallel. The various tools and kernel parts are only to a certain degree thread safe. The goal is to improve the thread safety step by step to enable parallel execution.

Technical Details

This would be a first step toward supporting parallel systems, independent of an actual VM. It involves pure in-image Smalltalk programming and requires some knowledge of parallel programming. The parts that are not thread-safe are typically easy to discover, either by observing race conditions or by finding patterns in the code that rely on single-threaded scheduling semantics.
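To illustrate the kind of pattern that relies on single-threaded scheduling semantics, here is a minimal sketch. It is written in Python rather than Smalltalk, and the class and method names are hypothetical, not taken from Squeak or Pharo. The unguarded lookup is only correct if nothing else can run between the check and the store; the guarded variant wraps the whole check-then-act sequence in a lock.

```python
import threading

class UnsafeRegistry:
    """Lazy initialization that assumes no other thread runs between check and store."""
    def __init__(self):
        self.items = {}

    def at_if_absent_put(self, key, factory):
        if key not in self.items:        # a second thread may pass this check too
            self.items[key] = factory()  # ...and then both threads create a value
        return self.items[key]

class SafeRegistry(UnsafeRegistry):
    """Same behaviour, but the check-then-act sequence is protected by a lock."""
    def __init__(self):
        super().__init__()
        self._lock = threading.Lock()

    def at_if_absent_put(self, key, factory):
        with self._lock:
            return super().at_if_absent_put(key, factory)

def count_initializations(registry, workers=8):
    """Ask for the same key from several threads; return how often the factory ran."""
    created = []

    def factory():
        value = object()
        created.append(value)  # appending to a list is safe under CPython's interpreter lock
        return value

    threads = [
        threading.Thread(target=registry.at_if_absent_put, args=("shared", factory))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(created)
```

With `SafeRegistry` the factory runs exactly once no matter how the threads interleave. With `UnsafeRegistry` it usually also runs once, which is exactly why such code tends to go unnoticed until it is executed in parallel.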

Benefits to the Student

The student will gain deep insight into multicore programming and concurrency problems. This kind of experience is still missing from many curricula and is required in many software development projects.

Benefits to the Community

Improving support for today's hardware parallelism.

Bring RoarVM-ideas to the standard SqueakVM stack-interpreter

Level: Advanced

Possible mentor: Stefan Marr

Possible second mentor: ?

Description

The RoarVM uses a simple static approach to parallelism. The number of cores used is fixed at startup time, which significantly reduces the complexity needed for thread support. This approach could be applied relatively straightforwardly to the Squeak interpreter and eventually the CogVM. Existing libraries and approaches can be reused, but need to be ported to Squeak's Slang-based implementation.

Technical Details

Since the RoarVM is implemented in C++, the main task is to identify where each part of the RoarVM belongs in the Slang implementation. Furthermore, some of the libraries need to be translated from C++ to Slang.

Benefits to the Student

Getting a deeper knowledge of a bytecode interpreting VM and low-level concurrency and performance issues.

Benefits to the Community

Enabling parallel execution of Smalltalk on a 100% compatible platform that has the same performance characteristics as the current VMs.

 

 

Tools

 

 

Search Indexing of Smalltalk image

Level: Intermediate

Possible mentor: Aik-Siong Koh

Possible second mentor: Ian Chai

Description

All Smalltalk development environments can now search for implementors, senders or strings using brute force. The results are usually just listed alphabetically. This project will use search technology to index and page rank all the packages, classes, methods and comments in the image. Search results will be returned quickly and ranked intelligently. It will be a Search Engine for Smalltalk in Smalltalk. A search browser will be created to accept search strings and return columns of hits for packages, classes, methods and comments listed according to their page ranks. Time permitting, an autocompletion capability will be implemented to suggest relevant methods for code writing. The code will be made portable to all Smalltalk dialects.

Technical Details

See "The Anatomy of a Large-Scale Hypertextual Web Search Engine": http://infolab.stanford.edu/~backrub/google.html
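The two ingredients named in the description, an index and a page rank, can be sketched in a few lines. This is an illustration in Python, not the project's design: the toy method table, the choice of sender-to-implementor references as "links", and all function names are assumptions made for the example.

```python
from collections import defaultdict

def build_index(methods):
    """Map each lowercase word to the set of method names whose source mentions it."""
    index = defaultdict(set)
    for name, source in methods.items():
        for word in source.lower().split():
            index[word].add(name)
    return index

def page_rank(links, damping=0.85, iterations=50):
    """Power iteration; links maps each node to the list of nodes it references."""
    nodes = set(links) | {target for targets in links.values() for target in targets}
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # A node without outgoing references spreads its rank evenly over all nodes.
            targets = links.get(node) or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

def search(word, index, rank):
    """Return matching method names, highest rank first (name breaks ties)."""
    hits = index.get(word.lower(), ())
    return sorted(hits, key=lambda name: (-rank.get(name, 0.0), name))

# Toy data: method name -> source text, and sender -> implementors it calls.
methods = {
    "Collection>>do:": "iterate over all elements",
    "Collection>>collect:": "iterate with do: and gather the results",
    "Collection>>select:": "iterate with do: and keep matching elements",
}
calls = {
    "Collection>>collect:": ["Collection>>do:"],
    "Collection>>select:": ["Collection>>do:"],
    "Collection>>do:": [],
}
index = build_index(methods)
rank = page_rank(calls)
results = search("iterate", index, rank)
```

Here `Collection>>do:` is listed first for the query "iterate" because the other two methods reference it. A real implementation would also have to rank packages, classes and comments, and keep the index up to date as the image changes.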

Benefits to the Student

The student will learn the important areas of search technology and object oriented programming. The student will experience creating something that is immediately useful to all Smalltalk programmers.

Benefits to the Community

Fast and intelligent search of Smalltalk code will help the community advance Smalltalk development even faster. Additional programming capabilities can be built on top of the search capabilities.

 

Image provisioning tool

Level: Intermediate

Possible mentor: Geoffroy Couprie

Possible second mentor:

Description

A lot of developers keep multiple images around for development, and a good practice is to start from a fresh image for each new project. Oddly, this process is not yet included in the common developer tools, so we rely on custom scripts (in the best case) or lots of drags and drops (in the worst case) to manage our development environment. We need a tool to simplify the image management and keep track of all our projects.

The goal of this project is to build a tool that can automatically create new projects from fresh images (e.g. Core, Development), load the needed tools and classes (e.g. Seaside), and manage those projects (list, delete, etc.). This tool could also be used to deploy development code to testing and staging images, and to build the production image.

Technical Details

The image provisioning tool would need a simple interface to manage the different projects, and be able to launch a new environment and load code into it. We already have Metacello, a configuration and dependency
tracking tool, which can be used to get all the packages needed to create a new image.

Benefits to the Student

The student would learn a lot about package management and software configuration management, and may be able to build a complete development workflow.

Benefits to the Community

The Smalltalk community will gain a useful tool to streamline its development process and save time in research and development.

 

Big data CSV parser plugin

Level: Intermediate

Possible mentor: Hernán Morales Durand

Possible second mentor:

Description

With the advent of inexpensive DNA microarray technology, big data is now available to many small and medium laboratories which perform statistical analysis based on microarray experiments. Most of the time the data produced by genotyping services is delivered in CSV format, as it is a cross-platform "standard" which is easily readable and still used in hundreds of business applications. In Smalltalk we have several CSV parsers, but their performance is far from competitive with libraries implemented in other languages. The goal of this project is to measure execution time and build a plugin to access CSV data in a fast and competitive way.

Technical Details

Several open source projects currently implement C functions to access CSV data. The challenge of this project is to learn tools like VMMaker and the Interpreter Plugin classes in order to develop an internal or external Squeak/Pharo plugin.
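Measuring execution time needs a repeatable baseline before any plugin is written. The sketch below shows one possible shape for such a harness. It is written in Python and uses Python's standard `csv` module only as a stand-in parser; the function names and the synthetic data layout are assumptions for illustration.

```python
import csv
import io
import time

def make_csv(rows, columns):
    """Build a synthetic CSV text with one header line and the given number of data rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([f"marker_{c}" for c in range(columns)])
    for r in range(rows):
        writer.writerow([f"{r}.{c}" for c in range(columns)])
    return buffer.getvalue()

def time_parse(text, repeats=3):
    """Parse the text several times; return (rows parsed, best wall-clock seconds)."""
    best = float("inf")
    count = 0
    for _ in range(repeats):
        start = time.perf_counter()
        count = sum(1 for _row in csv.reader(io.StringIO(text)))
        best = min(best, time.perf_counter() - start)
    return count, best

sample = make_csv(rows=1000, columns=20)
row_count, seconds = time_parse(sample)
```

Taking the best of several runs reduces noise from other processes. The same input text can then be fed to each Smalltalk parser and to the plugin, so that the numbers are comparable.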

Benefits to the Student

The student will learn about interfacing highly efficient libraries to Smalltalk.

Benefits to the Community

The Smalltalk community will gain a fast library for an extremely common task: dealing with CSV files.

 

HDF5 support

Level: Intermediate

Possible mentor: Hernán Morales Durand

Possible second mentor:

Description

Hierarchical Data Format 5 (HDF5) is a new (1998) format capable of storing large and complex amounts of data, and it is used in Gravitational and Plasma Physics, Earth Science research, Weather Services, Software Engineering,
Biomedical Informatics, etc. As new data acquisition hardware provides bigger datasets (for example, sequencing data), the need to query and access metadata and partial or full datasets in an efficient way
(parallel I/O) becomes more important. In this format, data are stored in a hierarchical structure similar to the UNIX file system, and the data model supports a rich variety of data types and data space organizations. APIs and wrappers currently exist for Java, .NET, Python, C and FORTRAN.

The goal of this project is to build a wrapper that enables access to HDF5 data from Smalltalk. This binding could open Smalltalk to many scientific domains and users to whom pure object technology is currently unknown.

Technical Details

The student will need to learn details of the HDF format, such as data sets and composite data types.

Benefits to the Student

The student would learn about efficient data systems, implement an API,
and experiment with large scientific data in Smalltalk.

Benefits to the Community

The Smalltalk community will attract more users by keeping in touch with big data analytics, by providing access to an efficient data format used currently in research and business.

 

Package management with Fuel

Level:

Possible mentor: Martin Dias

Possible second mentor:

Description

Fuel is a general-purpose binary serializer. It already saves and loads classes without using a compiler. Package
management has additional challenges, such as checking dependencies, running pre- and post-scripts, overriding existing classes or methods, tolerating superclass shape changes, running system validations, sending notifications, clean uninstallation, and others. Integration with current tools like Monticello, Gofer and Metacello would be good.

Benefit for the Student

Complete understanding of the life cycle of classes and packages in the system.

Benefit for the Community

Pharo/Squeak users will have the option of loading classes without compilation. This would be useful in bootstrap image experimentation. This approach has proved very useful in other Smalltalk environments, for example in Parcels (VW Smalltalk).

 

Export Excel files and other spreadsheet related goodies

Level: Beginner/Intermediate

Possible Mentor: Carla Griggio

Possible Second Mentor:

Description

Integration with Excel (or other spreadsheet) files is a very common request in the software industry, and Smalltalk lacks an open source tool that offers this feature.
This project aims to enable Pharo and other open source Smalltalks to export Excel and/or other spreadsheet formats, so that systems which need to serve real spreadsheets don't have to keep using the CSV file format as a substitute.

Technical Details

CSV files are usually used instead of real spreadsheet file formats when software needs to import data from a spreadsheet or export data in the form of a spreadsheet. But that loses cell formatting and data type information that users often need.
The student will have to study different spreadsheet formats (for example, Excel and Open Office) to be able to read and write spreadsheet files.
Integration with Google Spreadsheets for software with a web user interface would also be nice to have.

Benefits to the Student

This project could be ideal for someone just starting to use Smalltalk. It will help the student learn the basic usage of the language and still deliver a very useful tool.
Also, integrating a web framework with Google Spreadsheets will teach the student about building web interfaces with Smalltalk and interacting with a third-party API, which is widely done in software development.

Benefits to the Community

Exporting and managing spreadsheets is in high demand in industry. Smalltalk users who build software for end users will be able to offer real spreadsheets as a feature of their systems, instead of continuing to use CSV files as the alternative.

 

Community-wide VM Performance tracking

Level: Beginner

Possible mentor: Stefan Marr

Possible second mentor: ?

Description

The various Smalltalk VMs are under constant development and maintenance. To avoid performance regressions, they need to be monitored and tracked with every change. The goal is to collect a set of standard benchmarks and set up the necessary infrastructure based on Jenkins/Buildbot and ReBench, SMark, and CodeSpeed.

Technical Details

The main parts of the infrastructure are already available as Smalltalk-independent projects. With ReBench and SMark, the Smalltalk community even has the bits and pieces for running benchmarks automatically. What is still missing is a setup that uses the continuous integration servers to benchmark the evolving VM codebases.
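At its core, tracking performance with every change means comparing the timings of a new build against a baseline. The following sketch shows that comparison in Python. It does not use the ReBench, SMark or CodeSpeed APIs; the data layout (benchmark name mapped to a list of run times), the function name and the 5% tolerance are assumptions for illustration.

```python
from statistics import median

def find_regressions(baseline, current, tolerance=0.05):
    """Return {benchmark: slowdown factor} for benchmarks slower than baseline by more than tolerance."""
    regressions = {}
    for name, old_runs in baseline.items():
        new_runs = current.get(name)
        if not new_runs:
            continue  # benchmark was not run for this build
        old, new = median(old_runs), median(new_runs)
        if new > old * (1.0 + tolerance):
            regressions[name] = new / old
    return regressions

# Hypothetical run times in milliseconds for two benchmarks.
baseline = {"Fannkuch": [100, 102, 98], "NBody": [50, 51, 49]}
current = {"Fannkuch": [101, 99, 100], "NBody": [60, 61, 59]}
report = find_regressions(baseline, current)
```

Using the median of several runs makes the check less sensitive to a single noisy measurement. A continuous integration job could run this after each benchmark pass and fail the build when the report is not empty.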

Benefits to the Student

Get an insight into setting up and managing typical cross-technology projects. Get a better understanding of the pitfalls of performance evaluation and benchmarking on modern platforms.

Benefits to the Community

An additional form of quality assurance for core elements of the ecosystem, i.e. the VMs. Furthermore, an objective measure for comparing different Smalltalk implementations.

 

A binding to R

Level: Advanced

Possible mentor: Hernán Morales Durand

Possible second mentor: ?

Description

With the increased popularity of high-level statistical programming languages and environments for data analysis such as R, a way to interface with R is becoming a must-have for Smalltalk. R implements complex and unrivalled statistical techniques for data manipulation and presentation, such as Analysis of Variance, Covariance, Time Series, Generalized Linear Models, Additive Models, Non-linear Regressions, Tree Models and Multivariate Statistics, besides many mathematical functions, and is used in fields from economics to medicine and engineering. It is estimated that R has about 2 million users worldwide and more than 2000 add-ons, with more added every day through repositories like CRAN and Crantastic.org.

Technical Details

The student should know or be motivated to learn statistics with the R environment and language, and its fundamental workflow: importing and preparing the data, and finally running the analysis, and presenting the
results. Dealing with R sessions and presentation of results (like vectors and plots) will be challenging too.

Benefits to the Student

The student will gain invaluable experience from two complementary environments, and experience with the chosen interface technology will be useful for the many projects where Smalltalk needs help from external systems.

Benefits to the Community

The goal of this project is to build a wrapper to interface R, an open source statistical programming language, providing a whole range of missing functionality to Smalltalk. This binding could complement the R environment where a general programming environment is needed, attracting many statisticians, and will open Smalltalk to
domain-specific areas as diverse as Clinical Trials, Finance and Machine Learning.

 

 

User Interface

 

 

Port OpenQwaq video to Etoys

Level:

Possible mentor: Karl Ramberg

Possible second mentor:

Description

OpenQwaq supports many video playback formats. Etoys supports only a few, rather odd formats, and this hinders the use of video with Etoys. Formats such as H.264 are standard on the web; with them, Etoys could play back videos and post them to, for example, YouTube.

Benefit for the Student

Work with the underbelly of both the VM and the image to support state-of-the-art video playback and recording.

Benefit for the Community

We would be able to use and make videos from Etoys. Many sites have great educational videos posted. Being able to annotate and interact with these would make a great tool for students and teachers.

 

Integrate Gezira into Squeak/Pharo

Level: Intermediate

Possible mentor: Jeff Gonis

Possible second mentor:

Description

Squeak started life as a highly graphical environment for exploring and experimenting with computation. Part of its appeal was that it was completely understandable from the ground up, including how every pixel on screen gets there. However, Squeak's graphical facilities have fallen behind the current state of the art, and are still fundamentally focused on manipulating bitmaps. Gezira offers a state-of-the-art vector graphics framework that is at the same time compact and understandable. It is being actively developed by the people at VPRI and already has a Smalltalk implementation courtesy of Bert Freudenberg, which is available from the VPRI website.

Technical Details

Take Bert's work done for VPRI and integrate it into the current Squeak VM. On the image side, integrate Gezira into Squeak's canvas hierarchy and, time permitting, begin retrofitting the current Morphic rendering to use Gezira. Work to speed up Gezira in Smalltalk could also be a part of the project.

Benefits to the Student

The student will learn about vector graphics, as well as the Squeak VM. It will also provide valuable experience learning how VPRI is going about creating software that is both functional and compact at the same time.

Benefits to the Community

Squeak and Pharo will gain a modern, understandable vector graphics framework allowing for graphical capabilities that are of higher quality than what is currently available.

 

Nautilus

Level: Intermediate

Possible Mentor: Stephane Ducasse

Description

Nautilus is a new browser based on the latest system meta-model tools such as RPackage and Ring. The goal is to ensure this browser has the stability and all the features required to become the next standard browser.

Technical Details

Nautilus may be improved on different levels: fixing the remaining bugs, improving the way the UI widgets are defined so that the UI representation can be changed easily, creating plugins for metrics, better traits integration, and better icons. Morphic may have to be improved as well.

Benefits to the Student

The student will learn and use the latest infrastructure tools and the different layers of Morphic. The student will also contribute to a tool that is in use now and will remain in use.

Benefits to the Community

The community will gain a more stable and up to date default browser.

 

Extend Magritte 3 to create Naked Objects style interfaces

Level: Intermediate

Possible mentor: Stephan Eggermont

Possible second mentor: Esteban Lorenzano


Description

"What if you never had to write a user interface again? What if you could simply expose your business objects directly to the end user? How would this affect your productivity? The way you work? The flexibility of your
applications? Is this even possible? Sometimes, yes. This describes a style of application development, Naked Objects, where you write just the business objects, and a framework lets your users interact directly with
these objects." (Dave Thomas)

Technical Details

Using Magritte we can already build (web/Glamour) components representing attributes of domain objects. What should we add to enable a Naked Objects (NO) UI?

  • (better descriptions of) relationships between domain objects;
  • descriptions of actions, with and without parameters;
  • build a canvas where multiple domain objects can be shown;
  • make the domain objects identifiable with an icon and title;
  • add drag-and-drop to initiate actions and manage relationships;
  • add buttons/drop down menus for actions;

(and a few other things, which I'll think of when these are finished :)
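The core idea behind the list above, deriving attributes and actions from the domain object itself instead of hand-writing a UI, can be sketched with plain reflection. This example is in Python and does not use Magritte; the `Invoice` class and the `describe` and `invoke` helpers are hypothetical, and Magritte descriptions carry much more information than what is discovered here.

```python
import inspect
from dataclasses import dataclass, fields

@dataclass
class Invoice:
    """A hypothetical business object with two attributes-changing actions."""
    number: str
    amount: float
    paid: bool = False

    def mark_paid(self):
        self.paid = True

    def add_surcharge(self, percent: float):
        self.amount = round(self.amount * (1 + percent / 100), 2)

def describe(obj):
    """Derive a UI model (title, attributes, actions with parameters) from the object."""
    attributes = [(field.name, getattr(obj, field.name)) for field in fields(obj)]
    actions = []
    for name, method in inspect.getmembers(obj, inspect.ismethod):
        if name.startswith("_"):
            continue  # skip generated and private methods
        parameters = list(inspect.signature(method).parameters)
        actions.append((name, parameters))
    return {"title": type(obj).__name__, "attributes": attributes, "actions": actions}

def invoke(obj, action, *arguments):
    """Run an action chosen in the UI and return the refreshed description."""
    getattr(obj, action)(*arguments)
    return describe(obj)

invoice = Invoice("2012-001", 100.0)
model = describe(invoice)
```

A generic renderer could turn `model["actions"]` into buttons or menu entries, asking for input whenever an action lists parameters, and redraw the attributes from the description returned by `invoke`.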

Benefits to the student

  • Learn to use and extend a high quality framework.
  • Learn and practice interaction design.
  • Learn about hexagonal architecture.


Benefits to the community

The Naked Objects UI enables a very short feedback loop when prototyping a domain model. It provides a showcase for the superiority of Smalltalk as a rapid development environment. It allows scaling down the engineering effort needed to create custom solutions. It fits very well with exploratory modeling.
In a mixed UI style it allows focusing effort on only the critical parts of the UI.

 

 

Web

 

 

Make web browser plugin of Squeak work better on all platforms

Level:

Possible mentor: Karl Ramberg

Possible second mentor:


Description

The web browser plugin for Squeak was made quite a long time ago and there are small issues with most platforms. 

Benefit for the Student

Learn the platform specific code for many platforms and their implementation.

Benefit for the Community

Better support for Etoys on all platforms.

 

ePUB Output for Pier Books

Level: Intermediate

Possible mentor: Nick Ager

Possible second mentor: Lukas Renggli

Description

Pier [1][2] is an extensible object-oriented content management system that includes a book authoring engine. The book authoring engine has been used to document a number of Smalltalk-based projects, most notably Seaside [3] as well as Moose [4] and Pharo [5].

The goal of this project is to add ePUB [6] as an output format to Pier, enabling existing and future Pier books and other Pier content to be output in a format readable by popular eReaders.

[1] http://www.piercms.com/
[2] http://code.google.com/p/pier/
[3] http://book.seaside.st/
[4] http://www.themoosebook.org/
[5] http://book.pharo-project.org/
[6] http://en.wikipedia.org/wiki/EPUB

Technical Details

Content in Pier is parsed into a document tree that supports the visitor pattern [7] for traversing the internal representation. Pier currently supports a number of output formats such as HTML, plain text, RSS and wiki
text. These formats are generated by visitors. The existing visitors would act as templates for creating an ePUB visitor which would generate ePUB-compatible output.

[7] http://en.wikipedia.org/wiki/Visitor_pattern
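The visitor approach can be sketched as follows. The example is in Python and does not reproduce Pier's actual class names or visitor protocol; the node classes and both visitors are hypothetical. It only shows why adding an output format means adding one visitor while leaving the document tree untouched. ePUB content documents are XHTML, so the second visitor emits escaped markup; packaging the result into an ePUB container is not shown.

```python
from html import escape

class Text:
    def __init__(self, content):
        self.content = content

    def accept(self, visitor):
        return visitor.visit_text(self)

class Paragraph:
    def __init__(self, *children):
        self.children = children

    def accept(self, visitor):
        return visitor.visit_paragraph(self)

class Section:
    def __init__(self, title, *children):
        self.title = title
        self.children = children

    def accept(self, visitor):
        return visitor.visit_section(self)

class PlainTextVisitor:
    """Stands in for an existing output format."""
    def visit_text(self, node):
        return node.content

    def visit_paragraph(self, node):
        return "".join(child.accept(self) for child in node.children) + "\n"

    def visit_section(self, node):
        body = "".join(child.accept(self) for child in node.children)
        return node.title + "\n\n" + body

class XhtmlVisitor:
    """A new format added without changing the node classes."""
    def visit_text(self, node):
        return escape(node.content)

    def visit_paragraph(self, node):
        return "<p>" + "".join(child.accept(self) for child in node.children) + "</p>"

    def visit_section(self, node):
        body = "".join(child.accept(self) for child in node.children)
        return f"<section><h1>{escape(node.title)}</h1>{body}</section>"

document = Section("Intro", Paragraph(Text("Seaside & Pier")))
plain = document.accept(PlainTextVisitor())
xhtml = document.accept(XhtmlVisitor())
```

Both visitors traverse the same `document` object. Only the XHTML visitor has to care about escaping characters such as `&`.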

Benefits to the Student

The project is well defined with a clear deliverable. The student will gain an understanding of the internals of a powerful, well-structured content management system, as well as the satisfaction of producing a set of
artefacts: the existing Pier book content translated into ePUB format.

Benefits to the Community

The Smalltalk community and others using the Pier book authoring engine will gain ePUB as an output format. With an ebook output format authors have a potential new revenue source, encouraging a virtuous cycle of
increasing documentation within and outside the community.

 

CSS Template System

Level: Easy / Intermediate

Possible mentor: Hernán Morales Durand

Possible second mentor: ?

Description

Cascading Style Sheets (CSS) is a technology which allows personalized presentation styles for document contents. It is based on textual source code which influences the visual style and presentation of HTML
and XML documents. This includes ways of assigning layout and style properties, such as font types and sizes, to different types of document components when they are presented. Given the long time it took from specification (CSS 1 : 1996, CSS 2 : 1998) to full support by web browsers (2000 & 2006 for CSS2), it is predictable that CSS 3 modules like Backgrounds and Colors, Media Queries, and Multi-column Layout will not be fully supported any time soon (as of 2012), although many web sites today use CSS heavily. Through the years, the need for high-level presentation objects like grids and common layouts led to the creation of CSS libraries, called CSS frameworks, to help web programmers compose more attractive web sites rapidly.

Technical Details

The objective is to complete an initial CSS framework in Smalltalk (Phantasia) with CSS Template
objects. Phantasia removes the need to learn CSS syntax and, properly completed, could provide a rich object model of CSS objects beyond the limits of the specification, suitable for editing through Smalltalk code. High-level template objects could include navigational ones like lists, menus, or palettes, useful in wizards, and presentational ones like
template layouts, which are already implemented by external CSS libraries like Blueprint, scaling up to whole CSS templates including reports, galleries, etc.

The model should be independent of web framework so it could be re-used in Seaside, Aida, Iliad or other future web frameworks.
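To make the idea of CSS objects more concrete, here is a minimal sketch of a low-level rule object and a high-level layout template built on top of it. It is written in Python and is not Phantasia's API; the class names `Rule` and `ColumnLayout` and the `.column` class selector are assumptions for illustration.

```python
class Rule:
    """A low-level CSS object: one selector with its property declarations."""
    def __init__(self, selector, **properties):
        self.selector = selector
        # Keyword names use underscores; CSS property names use hyphens.
        self.properties = {
            name.replace("_", "-"): value for name, value in properties.items()
        }

    def render(self):
        body = " ".join(f"{name}: {value};" for name, value in self.properties.items())
        return f"{self.selector} {{ {body} }}"

class ColumnLayout:
    """A hypothetical high-level template object: equal-width floated columns."""
    def __init__(self, container, columns):
        self.container = container
        self.columns = columns

    def rules(self):
        width = 100 / self.columns
        return [
            Rule(self.container, overflow="hidden"),
            Rule(f"{self.container} .column", float="left", width=f"{width:.2f}%"),
        ]

    def render(self):
        return "\n".join(rule.render() for rule in self.rules())

stylesheet = ColumnLayout("#main", 4).render()
```

The application code only states "four columns in #main"; the CSS text is generated from objects. Because nothing here depends on a particular web framework, the same objects could be rendered into any framework's page output.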

Benefits to the Student

The student will learn how to build and extend a model for a limited textual technology (CSS), from low-level objects like CSS functions to high-level objects like Positioners. The student could apply the results of this work to several popular Smalltalk web frameworks.
 
Benefits to the Community

The part of the Smalltalk community working with web technologies will benefit greatly from having ready-made CSS objects that are easy to incorporate into web applications.

 

Improving the Smalltalk GSoC website

Level: Intermediate

Possible mentor: Janko Mivšek

Possible second mentor:

Description

Our special website for Smalltalk GSoC (gsoc2012.esug.org, and the completed gsoc2010.esug.org from a past year) has public and administration parts that help the annual GSoC process run smoothly. It helps students, mentors and administrators in all phases: collecting ideas, preparing the projects, inviting the students, pairing them with projects, voting, and running the projects through the interim and final evaluations. This project is meant to improve both the public and administration parts of the website, to ease work for everyone and to upgrade the site to the most modern web technologies.

Technical details

Possible improvements: a Facebook-like timeline on personal pages with real-time updates (using WebSockets), notifications by email, more HTML5 support, an improved administration interface, collection of project
ideas directly from proposers, integration with the Google GSoC site and the mentor and student mailing lists, a REST API, a mobile app, ...

Benefit for the Student

Getting in touch with the current bleeding edge of Smalltalk web technologies, with the Aida/Web framework and Amber Smalltalk. Contributing to the community with valuable and immediately useful work.

Benefit for the Community

A well-run, easier GSoC process, with a good website that is up to date with modern web technologies, is important for improving our visibility and for showcasing our strengths in the web field to the broader open source community.

 

Persistency

 

 

External Image Database

Level: Intermediate-Advanced

Possible mentor: Bernat Romagosa

Possible second mentor:

Description

In Smalltalk, we often end up having to flatten our objects to make them fit into traditional databases. Be it SQL or key-value storage, the processes involved in re-mapping our objects are never trivial, and most of the time they force us to give up the comfort of dealing with live, consistent data as we are used to doing with image-based persistence.

The idea behind this project is to set up an image that is capable of storing a real Smalltalk object model, receiving real Smalltalk queries from other images and returning the result of executing them in the shape of real Smalltalk objects.

Technical details

The proposed platform for building the system is Pharo. As for the implementation of inter-image communication, Fuel-serialized objects over sockets are a fairly easy solution to implement, but something faster may be possible. Pharo images have a 2 GB size limit, which makes them unsuitable for storing large amounts of data, so a solution to this has to be found. Possibilities include a disk-based storage system à la GemStone or building a 64-bit capable VM.
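The "serialized objects over sockets" option can be sketched as a small request/response exchange. This example is in Python, with `pickle` standing in for Fuel and a thread standing in for the database image; the length-prefixed framing, the `Library` model and all function names are assumptions for illustration, not a proposed protocol.

```python
import pickle
import socket
import struct
import threading

def send_object(sock, obj):
    """Serialize obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, size):
    """Read exactly size bytes, since recv may return fewer."""
    data = bytearray()
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data.extend(chunk)
    return bytes(data)

def recv_object(sock):
    (size,) = struct.unpack("!I", recv_exact(sock, 4))
    # Unpickling is only acceptable between trusted peers.
    return pickle.loads(recv_exact(sock, size))

def serve_one(sock, model):
    """Database side: run one query (selector, arguments) against the live model."""
    selector, arguments = recv_object(sock)
    send_object(sock, getattr(model, selector)(*arguments))

def query(sock, selector, *arguments):
    """Client side: send a query and return the resulting objects."""
    send_object(sock, (selector, arguments))
    return recv_object(sock)

class Library:
    """Hypothetical object model held by the database image."""
    def __init__(self):
        self.books = [{"title": "First", "year": 1983}, {"title": "Second", "year": 2009}]

    def published_after(self, year):
        return [book for book in self.books if book["year"] > year]

server_end, client_end = socket.socketpair()
worker = threading.Thread(target=serve_one, args=(server_end, Library()))
worker.start()
result = query(client_end, "published_after", 2000)
worker.join()
server_end.close()
client_end.close()
```

The client receives real objects rather than rows to re-map. The cost is that every query pays for serialization and a round trip, which is why the text above suggests something faster may be worth exploring.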

Benefits to the Student

Exposure to non-traditional storage solutions, with a lot of freedom on design and implementation details.
Possibility to participate in a project that can have a great impact in the open-source Smalltalk community.

Benefits to the Community

A real Smalltalk object database that is free, both as in free beer and as in free speech. First steps in what seems to be a generalised future vision in the community: being able to easily build applications powered by a cluster of images, rather than big, bulky, single-image systems.




Updated: 11.3.2012