Software for humans

Tuesday, 24 March 2015

My experience with Btrfs on Debian - no space on device

I wish to share with you some of my experiences with Btrfs. Maybe they will be useful.

I've been using Btrfs on the /home partition of my Debian laptop for a few months now. All seemed to be pretty good. The only issue I had was that filesystem access seemed a bit slow on some occasions, but generally things were OK.

This went on until today, when I noticed a complete slowdown of the system, to the point that it became unusable. Some symptoms: Google Chrome crashed and became unusable, removing files was very slow (rm -rf node_modules, Nautilus deletion, etc.), IntelliJ WebStorm was barely starting, and node.js was complaining about 'not enough space on device'.

I also checked my root partition, which is ext4, to see whether the problem was Btrfs related or not: I created and removed some files there, and the operations performed normally.

My system was thus unusable and I didn't have any idea why.

So I started investigating...

I checked the Btrfs file system size, but things were OK:

btrfs fi df /home showed about 10 GB of free space:

sudo btrfs fi df /home/
Data, single: total=229.80GiB, used=218.80GiB
System, single: total=32.00MiB, used=20.00KiB
Metadata, single: total=3.00GiB, used=2.49GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

My fstab looked like this (I had cow active):

/dev/sda3 /home btrfs autodefrag,compress=lzo,space_cache 0 0

Not knowing what to do, I started changing /etc/fstab and rebooting my laptop. First I activated noatime, then I deactivated COW (with nodatacow), but I did not notice any improvement.

The solution:

I was very close to reinstalling Debian on my machine, a thought I didn't like since it would have meant a lot of work, but I decided to give it one more chance. I sat down at the keyboard and started a Btrfs defrag:

"btrfs filesystem defrag /home"

While I was at it, I thought it would be nice to have some more free space, so I started deleting a bunch of big files. That took a few minutes, I got some extra space and, eureka, my system was usable once more!

Data, single: total=229.80GiB, used=199.81GiB
System, single: total=32.00MiB, used=20.00KiB
Metadata, single: total=3.00GiB, used=2.49GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

My current /etc/fstab :

/dev/sda3 /home btrfs nodatacow,autodefrag,noatime,compress=lzo,space_cache 0 0

What caused this?

I'm not very sure. It's kind of obvious that I've hit the Btrfs low disk space problem. The strange thing is that I've had about 10 GB of free space for a long time (more than one or two months) and I did not notice this behaviour before. Thinking about what I did differently over the last few days, the main change is that I started using some node.js build tools that generate a lot of writes on every change (the Broccoli build tool, ember-cli). I will continue to investigate this, but right now I'm happy to have a working system, and I will also migrate my root partition to Btrfs once Debian Jessie is released.

On the downside, it's a pity that with Btrfs you can do so much, but you can't use all your disk space, and you can make your system very unstable very easily.

$ sudo btrfs fi show
Label: none uuid: 49d9c741-b00b-487b-be5c-49da02325a16
Total devices 1 FS bytes used 202.30GiB
devid 1 size 232.83GiB used 232.83GiB path /dev/sda3

Btrfs v3.17
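
Looking at these numbers again, here is a rough sketch of what may have happened (my reading, not a confirmed diagnosis). The chunk totals from btrfs fi df (Data + System + Metadata) add up to about 232.83 GiB, which is the same as the "used" value that btrfs fi show reports for the device. In other words, the whole device already appears to be allocated to chunks, so Metadata cannot grow beyond its 3.00 GiB, and with 2.49 GiB used plus the 512 MiB GlobalReserve (which, as far as I understand, is accounted inside Metadata) there is almost no headroom left. The small Java helper below only redoes that arithmetic; the class and method names are made up for illustration and it is not part of btrfs-progs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sums the chunk allocations printed by `btrfs fi df` (illustrative helper only). */
class BtrfsChunkMath {

    // Matches lines such as "Data, single: total=229.80GiB, used=199.81GiB".
    private static final Pattern LINE = Pattern.compile(
            "^(\\w+), \\w+: total=([\\d.]+)(B|KiB|MiB|GiB), used=([\\d.]+)(B|KiB|MiB|GiB)$");

    /** Converts a value with a binary unit suffix to GiB. */
    static double toGiB(double value, String unit) {
        switch (unit) {
            case "B":   return value / (1024.0 * 1024.0 * 1024.0);
            case "KiB": return value / (1024.0 * 1024.0);
            case "MiB": return value / 1024.0;
            case "GiB": return value;
            default:    throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }

    /** Parses the output into chunk type -> {total GiB, used GiB}. */
    static Map<String, double[]> parse(String dfOutput) {
        Map<String, double[]> chunks = new LinkedHashMap<>();
        for (String line : dfOutput.split("\\R")) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) {
                chunks.put(m.group(1), new double[] {
                        toGiB(Double.parseDouble(m.group(2)), m.group(3)),
                        toGiB(Double.parseDouble(m.group(4)), m.group(5))});
            }
        }
        return chunks;
    }

    /** Space allocated to chunks. GlobalReserve is skipped (assumed to live inside Metadata). */
    static double allocatedGiB(Map<String, double[]> chunks) {
        double sum = 0;
        for (Map.Entry<String, double[]> entry : chunks.entrySet()) {
            if (!entry.getKey().equals("GlobalReserve")) {
                sum += entry.getValue()[0];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        String df = String.join("\n",
                "Data, single: total=229.80GiB, used=199.81GiB",
                "System, single: total=32.00MiB, used=20.00KiB",
                "Metadata, single: total=3.00GiB, used=2.49GiB",
                "GlobalReserve, single: total=512.00MiB, used=0.00B");
        Map<String, double[]> chunks = parse(df);
        double deviceGiB = 232.83; // "size" from btrfs fi show
        double[] metadata = chunks.get("Metadata");
        double reserve = chunks.get("GlobalReserve")[0];
        System.out.printf("allocated to chunks: %.2f of %.2f GiB%n", allocatedGiB(chunks), deviceGiB);
        System.out.printf("metadata headroom:   %.2f GiB%n", metadata[0] - metadata[1] - reserve);
    }
}
```

If this reading is right, deleting big files helped because it emptied some Data chunks, not because the 10 GB "free" figure was wrong.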

$ uname -a

Linux daos 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt7-1 (2015-03-01) x86_64 GNU/Linux

Update:
More 'scientific' articles related to the subject:
http://unix.stackexchange.com/questions/174446/btrfs-error-error-during-balancing-no-space-left-on-device
http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html

Friday, 17 October 2014

Modular REST applications with Karaf features for OSGi Jax-RS Connector

The purpose of this article is to let you know how easy it is to develop modular REST (JAX-RS) applications on OSGi, particularly Apache Karaf.

For some time I've been working on improving the way I deliver applications. My focus is on quality, ease of understanding and speed of delivery. My framework of choice for some time has been the OSGi platform (mostly used on top of Karaf, but I'm getting bolder - going for bare-bone, native containers).

Regarding web applications I admit I don't like sessions and I am strongly inclined to develop stateless applications. Since I like standards and the benefits they provide, my choice for a web framework has narrowed down to JAX-RS for which there are a few implementations.

I came across a project called osgi-jax-rs-connector whose aim is to simplify web application development using JAX-RS on OSGi. The way it works is you write your JAX-RS annotated resources and you publish them as services in the OSGi service registry. Once there, the JAX-RS Publisher from osgi-jax-rs-connector will find them, take notice of the annotations and publish them. That's it.

In the project README on GitHub, you will find links to articles detailing the whole process.

All I did was add a features file for Apache Karaf so you can try it out easily. I've made a pull request with my code to make it part of the original code base, and hopefully it will be merged soon.

I'll reproduce the steps below. You start by building the project and installing the features in Apache Karaf:

    feature:repo-add mvn:com.eclipsesource.jaxrs/features/0.0.1-SNAPSHOT/xml/features
    feature:install scr http
    feature:install jax-rs-connector jax-rs-provider-moxy
    install mvn:com.eclipsesource.jaxrs/jax-rs-sample/0.0.1-SNAPSHOT

After this just go to: http://localhost:8181/services/greeting

In the meantime you can check the whole project, step by step, on my GitHub account.

There are other solutions out there for publishing JAX-RS resources using the OSGi HttpService. Another interesting approach is Neil Bartlett's JAX-RS OSGi extender. The main advantage (in my opinion) of the approach taken by Connector is that you publish objects instead of the extender building them for you. This means that I am free to choose the way I build my object and I also have the opportunity to inject dependencies into it before I publish it - hello CDI. I can build my objects using CDI via pax-cdi or with declarative services (as you can see in my sample code) and I am free to inject stuff into them before I expose them for registration with HttpService. That is a pretty powerful thing. I hope to show you how this is done soon.

Tuesday, 19 June 2012

Java mbox parsing with Apache James Mime4j

Today I pushed a small change to the Apache James Mime4j trunk. The change-set adds mbox parsing capabilities to Mime4j. It's a one-class Iterator that you can use to split an mbox file into individual messages, which you can then parse with Mime4j.

I also added a simple example here.

This is how you can use it:

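// mbox, ENCODER, messageSummary and count come from the surrounding example code
for (CharBufferWrapper message : MboxIterator.fromFile(mbox).charset(ENCODER.charset()).build()) {
    // each element is one message from the mbox, handed to mime4j as a stream
    System.out.println(messageSummary(message.asInputStream(ENCODER.charset())));
    count++;
}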

Thursday, 17 May 2012

JETM performance monitoring for Apache James

The story

Today I decided to improve how people can monitor the Apache James email server. James already has some monitoring available via:

Each of the above monitoring options is useful and necessary, as each one monitors a different aspect of how James runs. With regard to performance, only JMX monitoring provides metrics that allow a user to get a feel for the server state.
With JMX enabled you can just launch jconsole to access the attributes and methods that James exposes: number of active connections per component, limits, etc. You have complete monitoring access to the following services:

IMAP
POP3
SMTP
LMTP
RemoteManager
DNSService
Queues

The nice thing about JMX is that it's integrated with the JVM (so no external libraries are needed) and can provide very precise information on almost every aspect of your application, like memory, execution of methods, number of method calls, etc. You can see what James exposes on the Java Management Extension page. JMX is good, so why another solution to monitor James? Well, for JMX monitoring to work you have to write some code, and in general once that code is in there it's not easy to turn off. Since James is component based, disabling JMX is actually very easy: you just have to comment out some Spring bean declarations. So the second part is not an issue, but the first one is.
Monitoring all James components means you have to write code for each component. This will make the code base larger and harder to maintain. Here's where the Java™ Execution Time Measurement Library, or JETM, shines. It's a general library that you can use in your application to do monitoring. It's small, compact, adds little overhead and provides all the basic stuff that you need, including a web interface to view your results.
The nicest part about JETM is that it has Spring integration. Because James uses Spring extensively (the IoC part), integrating the two was a matter of a few hours (I took my time reading).

How it works

The JETM library has integration with spring framework and uses spring AOP. It creates a Proxy to intercept calls to the beans that you wish to monitor.
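
To illustrate the proxy idea without pulling in Spring or JETM, here is a toy version built on the JDK's java.lang.reflect.Proxy. It is not how JETM is implemented, and none of the names below are JETM API (Greeter is a hypothetical stand-in for a monitored bean); it only shows how a proxy can sit in front of an object and record call counts and timings.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Toy timing proxy; an illustration of call interception, not JETM code. */
class TimingProxyDemo {

    /** Hypothetical service interface standing in for a monitored bean. */
    interface Greeter {
        String greet(String name);
    }

    /** Number of intercepted calls per method name. */
    static final Map<String, LongAdder> CALLS = new ConcurrentHashMap<>();
    /** Accumulated execution time per method name, in nanoseconds. */
    static final Map<String, LongAdder> NANOS = new ConcurrentHashMap<>();

    /** Wraps target in a proxy that records a call count and the elapsed time of each call. */
    static <T> T monitor(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the real method's exception
            } finally {
                long elapsed = System.nanoTime() - start;
                CALLS.computeIfAbsent(method.getName(), k -> new LongAdder()).increment();
                NANOS.computeIfAbsent(method.getName(), k -> new LongAdder()).add(elapsed);
            }
        };
        Object wrapped = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(wrapped);
    }

    public static void main(String[] args) {
        Greeter real = name -> "Hello, " + name;
        Greeter monitored = monitor(real, Greeter.class);
        monitored.greet("James");
        monitored.greet("JETM");
        CALLS.forEach((method, count) -> System.out.println(
                method + ": " + count.sum() + " calls, " + NANOS.get(method).sum() + " ns"));
    }
}
```

A JDK proxy like this one only works for beans that implement an interface; for classes without one, Spring falls back to CGLIB-generated subclasses.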

The result

[Screenshots: JETM results, overview and detailed view]

The code

You won't find much here, just some configurations. You can see the full diff here.
In a few words: add the library to your project pom.xml file:


        <dependencies>
            <dependency>
                <groupId>fm.void.jetm</groupId>
                <artifactId>jetm</artifactId>
                <version>1.2.3</version>
            </dependency>
            <dependency>
                <groupId>fm.void.jetm</groupId>
                <artifactId>jetm-optional</artifactId>
                <version>1.2.3</version>
            </dependency>
        </dependencies>

Next declare the monitoring beans in your project (jetm-monitoring.conf in our case):


    <bean id="etmMonitor" class="etm.core.monitor.NestedMonitor"
          init-method="start" destroy-method="stop" />
    <bean id="etmHttpConsole" class="etm.contrib.console.HttpConsoleServer"
          init-method="start" destroy-method="stop" autowire="constructor"/>

    <bean id="etmMethodCallInterceptor"
          class="etm.contrib.aop.aopalliance.EtmMethodCallInterceptor"
          autowire="constructor"/>

    <bean id="etmAutoProxy"
          class="org.springframework.aop.framework.autoproxy.BeanNameAutoProxyCreator">
        <property name="interceptorNames">
            <list>
                <value>etmMethodCallInterceptor</value>
            </list>
        </property>
        <!-- the beanNames property is added in the next step -->
    </bean>

Here EtmMonitor is responsible for collecting and aggregating measurement points. The measurement points are method calls intercepted by EtmMethodCallInterceptor. The final step is to add the list of beans from your project that you wish to monitor (also in jetm-monitoring.conf). Here I added most of the beans exported by James.


    <bean id="etmAutoProxy"
          class="org.springframework.aop.framework.autoproxy.BeanNameAutoProxyCreator">
        <property name="interceptorNames">
            <list>
                <value>etmMethodCallInterceptor</value>
            </list>
        </property>
        <!-- add the beans that you wish to monitor to the list below -->
        <property name="beanNames">
            <list>
                <value>usersrepository</value>
                <value>recipientrewritetable</value>
                <value>domainlist</value>
                <value>mailrepositorystore</value>
                <!--
            The following beans require CGLIB2 to be on the classpath -->
                <!--
            <value>mailqueuefactory</value>
            <value>blobTransferPolicy</value>
            <value>jmsConnectionFactory</value>
            -->
                <value>jmsTransactionManager</value>
                <value>mailprocessor</value>
                <value>mailetcontext</value>
                <value>mailspooler</value>
                <value>mailetloader</value>
                <value>matcherloader</value>
                <value>filesystem</value>
                <value>dnsservice</value>
                <value>fetchmail</value>
                <value>smtpserver</value>
                <value>pop3server</value>
                <value>lmtpserver</value>
                <value>imapserver</value>
                <value>imapDecoder</value>
                <value>imapEncoder</value>
                <value>locker</value>
                <value>datasource</value>
            </list>
        </property>
    </bean>

Spring will do the rest (we import the jetm-monitoring.conf in spring-server.xml to bring it in context).

Final notes

You can do much more than what I've shown. You can also export the metrics JETM collects via JMX and the console, not just the HTTP console. Please read the online docs for more details.
You can get the source code as an example by checking out James App. After discussing this on the James mailing list, a number of other promising monitoring libraries turned up: Metrics from Coda Hale and Servo from Netflix. They both deserve a look.

Sunday, 4 March 2012

Trying out (pending) James 3.0-beta4 HBase mail store - take 1

Today I decided to try the newest (pending) release of James: 3.0-beta4.
I have access to a cluster of VMs running HBase and I wanted to see how things work, and maybe do a small stress test.

I downloaded the binary release from the staging repo, unpacked it and began configuring James to enable the HBase mailstore implementation.

I changed a line in mailbox.conf to enable the HBase mailstore provider, changed

import resource="classpath:META-INF/org/apache/james/spring-mailbox.xml"

in spring-server.xml to

import resource="classpath:META-INF/org/apache/james/spring-mailbox-hbase.xml"

and uncommented the mailbox declaration in spring-mailbox-hbase.xml to make the HBase mail store implementation available to the Spring context. After this I copied hbase-site.xml (and hdfs-site.xml) into the James conf directory.

And with that last step James is configured to send emails to the HBase cluster. All things done, I started James with bin/run and then... Exception.

| james.mailprocessor | Unable to init mailet LocalDelivery: org.apache.mailet.MailetException: Could not load mailet (LocalDelivery);
nested exception is:
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.apache.james.transport.mailets.LocalDelivery': Injection of resource dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'hbase-mailboxmanager' defined in class path resource [META-INF/org/apache/james/spring-mailbox-hbase.xml]: Cannot resolve reference to bean 'hbase-sessionMapperFactory' while setting constructor argument; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'hbase-sessionMapperFactory' defined in class path resource [META-INF/org/apache/james/spring-mailbox-hbase.xml]: Instantiation of bean failed; nested exception is org.springframework.beans.BeanInstantiationException: Could not instantiate bean class [org.apache.james.mailbox.hbase.HBaseMailboxSessionMapperFactory]: Constructor threw exception; nested exception is java.lang.RuntimeException: java.lang.IllegalArgumentException: Not a host:port pair: �%32649-- removed hostnames-- net,60000,1329515254741
.....

I searched for the cause of the exception and found out that it's a version mismatch. Unfortunately the cluster is running HBase 0.92 and the client is running 0.90. The client serialization is incompatible with the one used by the server.

Aside from the fact that the exception was fatal, there is an upside to this: James is trying to connect to the cluster.

I just found out that HBase 0.92 is on Maven Central, but without the tests jar, which makes upgrading impossible for now because our unit tests require it.

The tests jar will be available in 0.92.1 (as stated on hbase-dev). If that takes too long I will build and test with a private release of HBase 0.92.

I hope things move a bit faster with 0.92.1 with respect to artifacts being available on Maven Central.

Wednesday, 8 February 2012

NIO Iterator over messages in mbox file

I've started working on a small project called mbox-iterator that I wish to integrate with mime4j later, when it's more usable.

The idea is to provide an Iterator over all the messages in an mbox file and provide access to the raw data. You can then use mime4j to parse the message and do whatever you want with it. It's a good fit for use cases where you would like to do your own processing or just need access to the raw data.

The project uses Java NIO memory-mapped files to map the file into memory. This lets the OS manage the memory and file loading for you, and you get a ByteBuffer that you can use to access the data. We are not using this ByteBuffer directly, because it holds bytes and we need characters, so we have to decode the bytes. We use a Charset to get a CharsetDecoder for the encoding we need. CharsetDecoder returns a CharBuffer instance, and because CharBuffer implements CharSequence we can use a regex to determine the boundaries between messages.
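
As a rough sketch of the steps described above (not the actual mbox-iterator code): map the file, decode it with a CharsetDecoder and count From_ lines with a regex. The class name, the simplified pattern and the sample addresses are assumptions made for this example, and it only handles files small enough for a single mapping.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Maps an mbox file, decodes it and counts From_ lines (illustrative sketch). */
class MboxMapSketch {

    // Simplified From_ line: "From " plus a token at the start of a line.
    // Real mbox variants (mboxrd, mboxcl, ...) need a stricter pattern.
    static final Pattern FROM_LINE = Pattern.compile("^From \\S+.*$", Pattern.MULTILINE);

    /** Maps the whole file read-only and decodes it into a CharBuffer. */
    static CharBuffer mapAndDecode(Path mbox, Charset charset) throws IOException {
        try (FileChannel channel = FileChannel.open(mbox, StandardOpenOption.READ)) {
            MappedByteBuffer bytes = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // decode() copies the characters into a newly allocated CharBuffer
            return charset.newDecoder().decode(bytes);
        }
    }

    /** Counts the lines that look like the start of a message. */
    static int countMessages(CharSequence text) {
        Matcher matcher = FROM_LINE.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path mbox = Files.createTempFile("sample", ".mbox");
        mbox.toFile().deleteOnExit();
        String content = "From alice@example.org Wed Feb  8 10:00:00 2012\n"
                + "Subject: one\n\nfirst body\n\n"
                + "From bob@example.org Wed Feb  8 11:00:00 2012\n"
                + "Subject: two\n\nsecond body\n";
        Files.write(mbox, content.getBytes(StandardCharsets.US_ASCII));

        CharBuffer text = mapAndDecode(mbox, StandardCharsets.US_ASCII);
        System.out.println("messages found: " + countMessages(text));
    }
}
```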

I was hoping that CharBuffer would share the memory with the MappedByteBuffer that we get initially but it seems that this is not the case. We have some memory copying here because the CharBuffer we get is an instance of java.nio.HeapCharBuffer. I was thinking we could have zero Java heap memory and use just the O.S. pages for holding the mbox data but as it turns out, the process of translating bytes to chars needs some memory to keep the chars.

It would have been nice to use asCharBuffer() and provide zero-copy access, but darn, maybe with a future Java version.
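
To make the difference concrete, here is a small experiment. A direct ByteBuffer stands in for the mapped file (an assumption to keep the example self-contained; the class and method names are made up). asCharBuffer() does give a zero-copy view, but it reads every two bytes as one UTF-16 code unit, so it only helps if the file happens to be UTF-16. Decoding gives the right text, but in a new buffer on the Java heap.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

/** Compares a zero-copy char view with a decoded copy (illustrative sketch). */
class CharViewVsDecode {

    /** Builds a direct (off-heap) buffer holding ASCII text, standing in for a mapped file. */
    static ByteBuffer directAscii(String text) {
        byte[] raw = text.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buffer = ByteBuffer.allocateDirect(raw.length);
        buffer.put(raw);
        buffer.flip();
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer bytes = directAscii("From");

        // View: shares the off-heap memory, but pairs of bytes become single chars.
        CharBuffer view = bytes.asCharBuffer();
        System.out.println("view:    direct=" + view.isDirect() + ", length=" + view.length());

        // Decode: the text is right, but it is copied into a heap-backed CharBuffer.
        CharBuffer decoded = StandardCharsets.US_ASCII.decode(bytes.duplicate());
        System.out.println("decoded: direct=" + decoded.isDirect() + ", text=" + decoded);
    }
}
```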

Back to the matter at hand:

When we find a boundary between two messages, we return a slice()ed CharBuffer for that message. We also set position and limit so that we get just the message. This has the advantage of using the same memory as the CharBuffer we do the matching on and avoids unnecessary memory copy operations.

I have tested it on a small mbox (135 KB) and it performs OK. I'm planning more tests with a 2 GB mbox. Using NIO memory-mapped files we can map very large files and use the OS cache and buffer memory, so we can avoid GC activity and unnecessary memory copy operations.
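
A minimal sketch of the slicing idea, again with a simplified From_ pattern and made-up names rather than the project's real code: find the start of every message, then hand out one slice per message by setting limit and position on a duplicate of the decoded buffer.

```java
import java.nio.CharBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Splits a decoded mbox into per-message slices that share the same memory (sketch). */
class MboxSliceSketch {

    // Simplified From_ line; real mbox files need a stricter pattern.
    static final Pattern FROM_LINE = Pattern.compile("^From \\S+.*$", Pattern.MULTILINE);

    /** Returns one CharBuffer per message; no character data is copied. */
    static List<CharBuffer> split(CharBuffer text) {
        // Offsets reported by the matcher are relative to text.position().
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = FROM_LINE.matcher(text);
        while (matcher.find()) {
            starts.add(matcher.start());
        }

        List<CharBuffer> messages = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int begin = starts.get(i);
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : text.length();
            CharBuffer window = text.duplicate();      // same content, independent position/limit
            window.limit(text.position() + end);
            window.position(text.position() + begin);
            messages.add(window.slice());              // slice covers just this message
        }
        return messages;
    }

    public static void main(String[] args) {
        CharBuffer text = CharBuffer.wrap(
                "From alice@example.org Wed Feb  8 10:00:00 2012\nSubject: one\n\nfirst body\n"
                + "From bob@example.org Wed Feb  8 11:00:00 2012\nSubject: two\n\nsecond body\n");
        for (CharBuffer message : split(text)) {
            System.out.println("--- message of " + message.length() + " chars ---");
            System.out.print(message);
        }
    }
}
```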

The following areas need improvements before the project is usable:

- testing for different types of mbox files
- better regex to match different mbox formats (mboxcl, mboxrd and the likes)
- better regex for matching From_ lines (now we miss some From_ lines that have MAILER_DAEMON instead of email address)
- support for mboxes larger than 2GB - use more ByteBuffers to map portions of the files, watch out for mails that spill over boundaries.

I'm pretty new to NIO, so if you know how to do this better I'm open to suggestions and pull requests.

You will find a simple example that splits one mbox into individual messages in the project sources.

Happy hacking.

References and links:

[1] http://www.kdgregory.com/index.php?page=java.byteBuffer
[2] http://en.wikipedia.org/wiki/Mbox
[3] http://qmail.org/man/man5/mbox.html
[4] http://james.apache.org/mime4j/index.html
[5] https://github.com/ieugen/mbox-iterator

Tuesday, 27 September 2011

New job

A few days ago I joined the Big Data Solutions Team at 1and1. They have nice offices in Bucharest, but most of all I like the team and the projects.

Besides continuing to work with HBase and Hadoop, I will also expand into using Lucene, Solr and Mahout. All Apache technologies :).

I plan to use the experience I gain to integrate more functionality into Apache James. First on my list is mail search over HBase.
