You can use it from the main branch or the latest Docker image.
]]>The biggest change is an internal one: an upgrade of the ActivityPub library.
We also made changes to the home page and timelines. See the blog post for details.
In addition, the problem of activities not reaching other instances is solved. We thought we had fixed it before, but another cause remained. It should be gone now.
See the release notes for details.
]]>As a part of that, I added a few changes to home page and timelines. I
With these changes, I think you will be free from unexpected article updates.
They are applied to this instance. What do you think?
]]>I have finished an internal improvement that I have been working on since Jan. 2022: Upgrade activitystreams to 0.7, again
It doesn’t affect features or user interfaces directly, but it is a large change. So I will use it on my personal instance and see what happens. You can also try it using the Docker image’s ap07 tag or the ap07 branch of the source code. Note that it might have critical bugs that break federation with other instances.
Next, I will implement some tiny features and fix some tiny bugs. Then I will work on Rocket 0.5, which should also be hard work.
]]>% plm migration run
This update added support for the MAIL_PORT environment variable. See the documentation for details.
It also added a sign-up feature. You can see how to set it up under the SIGNUP variable on the Useful environment variables page. A blog post might help you.
Now I’m working on the “Malfunction while creating a blog post in Persian” issue and on upgrading the Rocket library to 0.5. I hope the next release includes fixes for them.
]]>In the current implementation, when the provided address is already used by an existing user,
What do you think about this behavior?
A’s purpose is:
B’s purposes are:
Any comments are welcome. I want to release this feature as version 0.7.1 in a few days if there are no problems.
You may try this feature today using the latest source code or Docker image. It requires running the migration before restarting Plume:
% plm migration run
Setting the SIGNUP environment variable to email enables the feature:
% SIGNUP=email plume
Thank you for using Plume. I hope you manage your Plume instances with comfort.
]]>I have finished the migration. Thank you, gled, for maintaining the blog for such a long time!
]]>Until now, Plume automatically converted an article’s URI to capital case (“article title” → “Article Title”) and inserted hyphens between some types of characters (“タイトル” → “タ-イ-ト-ル”). This behavior is not always wanted, so I removed it. Existing URIs won’t be modified; the change applies only to new articles’ titles.
If possible, we would greatly appreciate it if you ran the latest version from source code or the Docker image and reported issues, especially any trouble with federation with Mastodon servers.
This was reported a long time ago. Thank you for your patience.
We decided to release the next version v0.7.0.
The primary reason is that issues with federation with Mastodon are sometimes reported. We haven’t seen detailed logs, but the current edge version of Mastodon requires some HTTP headers to be signed, and Plume v0.6.0 doesn’t sign them. The next version includes the fix for this problem.
Of course, we think frequent releases are good, though they are not easy for us. I hope we can develop more rapidly and release more often.
As said above, we will release the next version. Until Apr. 25, we have a translation period. You may translate Plume’s UI on our Crowdin project. Two languages were added: Sinhala and Urdu. Translations for them are welcome, as well as for the existing languages.
See our translation page for details.
I’m aware that some issues remain open on Gitea and GitHub. Rust, the programming language Plume is written in, is getting more popular. Contributing to Plume is a chance to learn Rust :) Get involved if you’re interested!
]]>Currently, some instances of Plume, Mastodon and so on receive Plume’s new posts, and others don’t. The cause is that some attempts to notify other instances of a publication timed out when a Plume instance federates with many instances. That was fixed, and now (in the unreleased edge version) all instances should receive new posts.
The problem that the menu doesn’t open on iOS was finally fixed.
During the fix, I found that WASM also didn’t work. cargo-web, which was used to build WASM, doesn’t work well with newer Rust and is no longer maintained. So we switched the WASM builder to wasm-pack, and the build process changed slightly: you need to use wasm-pack instead of cargo-web. The documentation was also updated: Compiling from source
I turned the images in post cards into links.
I often tapped those images, only to be reminded every time that they were not links. Now they behave as I expect!
Marius reported that his instance has too many images even though it has only a few posts. He and I started an investigation, and it seems related to federation.
I would like to do some refactorings.
unwrap() and expect() in our Rust code. They bring up hidden problems. I want to rewrite them to error structure.
plume-models now. I think they should be in plume because they are service objects, such as the Rocket instance and the mailer, rather than models.
I started working on the email confirmation feature as explained at #636.
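As a rough illustration of the kind of rewrite from unwrap()/expect() to an error structure mentioned above, here is a minimal sketch. The function names and the ConfigError type are hypothetical and are not taken from Plume's code; the point is only that the failure becomes a value the caller can handle instead of a panic.

```rust
use std::fmt;

/// Hypothetical error type for this sketch; Plume's real error type may differ.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Missing(&'static str),
    InvalidPort(String),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(name) => write!(f, "{} is not set", name),
            ConfigError::InvalidPort(raw) => write!(f, "invalid port: {}", raw),
        }
    }
}

impl std::error::Error for ConfigError {}

/// Before: any missing or malformed value aborts the whole process.
pub fn port_or_panic(raw: Option<&str>) -> u16 {
    raw.expect("port is not set").parse().unwrap()
}

/// After: the same logic, but every failure is reported through Result.
pub fn port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let raw = raw.ok_or(ConfigError::Missing("port"))?;
    raw.parse::<u16>()
        .map_err(|_| ConfigError::InvalidPort(raw.to_string()))
}
```

With the second version, a request handler can turn the error into a proper response or log entry rather than crashing.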
I’m also thinking about the next release. I set up the v0.7.0 milestone with better communication with Mastodon, the introduction of an actor system, and a multi-author blog feature. The last one is difficult to develop soon, so the next release will include the former two improvements and some bug fixes.
]]>This is an internal improvement. In the future, Mastodon will require (request-target) in the signature HTTP header of requests. I added it.
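To show what including (request-target) means in practice, here is a minimal sketch of how the string to be signed is assembled, assuming the draft HTTP Signatures scheme commonly used in the fediverse. The function names are made up for this example and are not Plume's; the actual cryptographic signing step is left out.

```rust
/// Builds the text that gets signed: a "(request-target)" pseudo-header line
/// (lowercased method plus path), followed by the listed headers, one per line.
pub fn signing_string(method: &str, path: &str, headers: &[(&str, &str)]) -> String {
    let mut lines = Vec::with_capacity(headers.len() + 1);
    lines.push(format!("(request-target): {} {}", method.to_lowercase(), path));
    for (name, value) in headers {
        lines.push(format!("{}: {}", name.to_lowercase(), value.trim()));
    }
    lines.join("\n")
}

/// Builds the value of the `headers` parameter of the Signature header,
/// listing the signed names in the same order as in the signing string.
pub fn headers_param(headers: &[(&str, &str)]) -> String {
    let mut names = vec!["(request-target)".to_string()];
    names.extend(headers.iter().map(|(name, _)| name.to_lowercase()));
    names.join(" ")
}
```

Because the method and path are part of the signed text, a signature made for one endpoint cannot be replayed against another one.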
Now I’m working on a problem where a published post doesn’t reach some instances, including Plume, Mastodon and so on. I had a hypothesis and was able to verify that it’s correct. So I think I will fix it this week.
That’s all! Have fun reading, interacting with, and writing blog articles!
]]>I finished putting edit links next to post titles.
I think this reduces the frustration authors feel when they have to click many times to edit posts. But I’m not sure the design is good. If there’s a better place for the links, tell me or send a pull request!
You can try it using the latest Docker image, by building the repo’s main branch, or on my instance.
After the previous post, @iamdoubz reported an issue with the menu (thanks!) and @marek-lach is working on it. It became clear that we need a testing environment for the issue.
@pwFoo finally succeeded in building a Docker image including Plume’s musl builds! That’s great, though they point out that some more work is needed.
While developing more edit links, I added tests for a request handler. These are the first tests for handlers. Now that I have learned how to do it, I can test Plume’s behaviors.
On the other hand, I feel the need to structure test fixtures. Advice (or pull requests ;) ) is appreciated.
I added logging settings to .env.sample to suppress logs from external libraries. Logs are important. I can investigate some federation issues with logs.
@Ana filed an issue saying that Mastodon will verify requests from other instances more strictly in the near future and that Plume requires some modifications. I’m working on it. The implementation is roughly finished, but I’m struggling to run a Mastodon instance on my machine to check it. At first, I ran Mastodon with Docker Compose, but after some DNS problems I decided to run it on the host machine.
There was a question about backing up and restoring Plume’s data in the Matrix room. Daniel answered with a great guide.
I feel we should write more guides, but I find myself writing code rather than documentation…
Some people reported feature requests and issues. They’re appreciated. Thank you very much!
]]>main branch or pulling a docker image plumeorg/plume:latest. Here are some of them!
According to @ahangarha, it’s natural to read code blocks left-to-right in right-to-left documents in Persian. @FDB_hiroshima added the dir="auto" HTML attribute to make it work.
This improvement is applied to this fediverse.blog instance.
Thanks to an issue filed by @pullopen, I found that Chinese translations were not embedded into the Plume binary. I added translations for the languages below:
Though some languages don’t actually have translations yet, I added all languages registered on Crowdin, the translation platform we use.
Currently, you sometimes can neither like nor boost articles, nor follow someone on remote instances. This happens when the target article’s title or the writer’s name includes non-ASCII characters. This problem will be fixed in the next release.
This fix is applied to this fediverse.blog instance.
@marek-lach fixed an issue where the menu did not open even when you tapped the menu icon.
I have neither an iPhone, an iPad nor a Mac. I would be very glad if you tried it and sent feedback, though of course I will test it before the next release (should be 0.6.1 or 0.7.0).
This was a long-running refactoring. Riker is an actor system library for Rust. With the actor model, we get a simple view of the system architecture while keeping it easy to write asynchronous tasks. This should make Plume more stable and performant.
@meena went through many rounds of trial and error, and eventually wrote a design doc. That’s great work. Thank you.
@dr-bonez sent a pull request to introduce proxy support and it was merged. You can run Plume behind a proxy server and still have it federate with the fediverse.
I added the tracing logging library. I intend to use it to investigate federation problems in the near future. Federation issues are often reported to us. Fixing them is very hard if we cannot reproduce the problem in our development environment. In that case, we need to record logs in the production environment, and tracing should be a foundation that helps us do so.
This is not our improvement, but something @pwFoo is attempting. Musl is an alternative libc suited to static linking. By building Plume with it, you can get a more portable binary and run Plume on various machines, including Alpine Linux. See the issue if you’re interested in it.
We updated the Rust version to nightly-2021-01-15.
One day, building the Docker image on Docker Hub suddenly failed. To fix it, we needed to upgrade some crates and update the Rust version. If you build Plume from source code, the newer Rust will be downloaded automatically according to the rust-toolchain file.
I created the 0.7.0 milestone on our Gitea instance and added some issues there. Those are not promises, but they help me concentrate on the current task without keeping other tasks in mind. In addition, I think some of you have wanted to know what is being developed now, and the milestone provides that information.
I started adding edit links beside article titles on the home and blog pages.
Now you need to click links three times to reach a draft’s edit page:
This improvement reduces it:
or,
I’m sure it’s useful to put edit links on the dashboard and blog pages. But I’m not sure whether putting them on the home page is nice. Feedback is welcome!
]]>We are proud to announce Plume’s new pre-release version 0.6.0! All Plume administrators are encouraged to update to this version.
You might remember that our previous post was about 0.4.0. What about 0.5.0? We just didn’t write one. See the release notes for details ;)
Previous versions of Plume are vulnerable to a spoofing attack. You are encouraged to update Plume to 0.6.0. Roger Meyer reported this vulnerability and advised us. Thank you so much.
@FDB_Hiroshima added support for LDAP authentication. Check out our configuration page to see how to enable it on your instance. Once enabled, users registered in the LDAP directory will be able to authenticate on Plume with their LDAP credentials.
@KitaitiMakoto (it’s me ;) ) made tags stay as-is: in previous versions, when you added a tag Hello World, Plume automatically converted it to HelloWorld (the whitespace was removed). Since 0.6.0, this auto-conversion is disabled.
As an invisible improvement, @marek-lach accelerated the mobile menu’s opening transition. Did you notice it? This is CSS work, introduced in the default theme. You might need to implement it yourself if you use a custom theme.
As one of multi-lingual supports, a web font for Arabic script was added to the default theme. @DearRude requested this and @theMasix did it.
Other multi-lingual improvements are ones for right-to-left languages: These include a fix on title position and content direction. Reported by @quentin and @ahangarha, fixed by @FDB_hiroshima and @KitaitiMakoto.
You might find you couldn’t switch to the rich text editor. The problem was fixed.
When updating Plume to 0.5.0, you might have experienced trouble with the search index. 0.6.0 fixes it automatically where possible. Hopefully, you will never see the problem.
Other people reported or fixed issues and developed Plume. Thanks a lot! We cannot mention everyone, but you can see more details following links from 0.6.0 release page.
As mentioned on the official page, Plume development is now slowing down. It might be good to use other software such as WriteFreely or WordPress’ ActivityPub plugin. We will continue development, but as each member has less time to dedicate to Plume, we would appreciate it if you got involved!
We moved the repositories to a Gitea instance, although you can still check out source code from, report issues at, and send pull requests to GitHub as always.
Now we’re working on making Plume asynchronous, an effort led by meena. It includes
This should improve performance and stability.
I (KitaitiMakoto) am interested in improving and fixing bugs of federation. My first TODO item will be to investigate how we can follow Mastodon accounts, and vice-versa.
Also, UI improvements are important. @freyja_wildes found and likes Plume, and is working on liking and boosting Plume articles without a page transition. Writing posts in Plume should be fun! For that many more work-flows need to become seamless. For example, uploading images, on which I will be working for the next release.
We, of course, welcome any feature requests, bug reporting and pull requests. We will happily work on them with you, so come join us in our Matrix chat.
We want to thank everyone for reading and writing on Plume. A special thanks to everyone running an instance! And thank you all for reporting bugs and requesting features, and for discussing and developing Plume!
]]>First of all, there were a number of quite important changes at the organizational level. Indeed, we decided to adopt a Code of Conduct, to make sure contributing to Plume is a pleasant experience for everyone. If you are curious, you can read it here (it is based on the Contributor Covenant Code of Conduct).
We also opened a Loomio group to discuss non-technical issues. Loomio is a tool designed to discuss and vote on specific topics. It only requires an email address to participate, and we would be happy to welcome you to our debates. If you want to help Plume, but don't have much time or lack technical skills, participating on Loomio is a great way to contribute.
Among other things, we used Loomio to choose the new Plume logo, and after a poll, @trwnh@mastodon.social's proposition was chosen. It will now be displayed by default on all the instances and used for branding on other websites (Matrix, GitHub, etc). However, we plan to integrate other logos as well (so that the other propositions aren't lost) and we will add an option for you to choose the one you'd like to use on your instance.
Plume also has a website to introduce the project, host its documentation and contribution guides (this last page is based on Funkwhale's page).
But what about Plume itself? As you can imagine, since the first alpha was published, a lot of changes have happened.
Let's start with the bug fixes. There were a lot of them (we are still in an alpha phase after all), but the most important ones are probably the following:
Most of these changes are not visible (because things work as expected now), but will noticeably improve your experience with Plume (as well as your security).
There were also some design updates: @dfeyer@social.ttree.ch improved the contrast of input fields, and improved the titles' style so that they are readable even when wrapped onto multiple lines (and he started to work on more important design updates, which may come in the next release). Plume should also work better on small screens now.
On the federation side, @KokaKiwi and @FDB_hiroshima committed quite a lot of bug fixes. I (@Ana) also made a lot of changes to the code that handles incoming activities, making it easier to maintain and more compliant with the ActivityPub specification. The federation should be a lot more reliable now! @FDB_hiroshima also implemented two very important security and privacy related features: signature verification, to make sure the activities we receive from other instances are indeed coming from these instances, and blind key rotation. The NodeInfo, that gives information about a Fediverse instance for websites like the-federation.info, was also updated by @0x1C3B00DA@edolas.world to support both the version 2.0 and 2.1 of the standard.
@mareklach@mastodon.xyz also made a lot of changes to the phrasing used in the interface. Thanks to him, the English translation of Plume is now nicer to read. The term "follow" was also replaced by "subscribe" which is less person-centric and more adapted in the context of blogs.
The translations were also updated: Plume is almost completely translated into ten languages now!
I want to thank @alangarciar, @amikigu, @ardydo, @khannaankitaa, @ButterflyOfFire@mstdn.fr, @Devil505, @faho, @fitoshido, @FDB_hiroshima, @heldergg, @ida27, @Lacrymology, @m4sk1n@101010.pl, @ManuelFranz, @mareklach@mastodon.xyz, @metalbiker@ins.mastalab.app, @Moutmoutausore, @netopyrr, @ryonakano, @Silviu200530, @Swedneck, @UniqueActive, @wilPoly and @xosem@mstdn.io (that's a lot of people!) for their help. By the way, we are now using Crowdin to translate Plume, which makes it much more convenient when you are not familiar with GitHub.
Of course, we implemented a bunch of new features these last months! Among them:
A new editor, that should be more pleasant to use was also introduced! It is for the moment quite basic, but we plan to add a lot of features to make writing articles with Plume easier, even if you don't know Markdown.
We also improved integration with other services. Microformat tags were added, making it easier for external apps to display Plume articles in their interface. RSS feeds are also more discoverable now (a little link was added on blog and user pages), and a bug that broke them in some cases was fixed. OpenGraph metadata was also added, so that when you share a Plume article on another website, a nice preview card can be displayed.
The comments improved a lot too. It is now possible to delete them. We also correctly handle comment visibility from other instances, even if it is not yet possible to change it when commenting from Plume. Content warnings (as they are called in Mastodon) are also now supported in comments.
We also added features that will please people hosting an instance. First of all, thanks to @hirojin@dev.glitch.social it is now possible to use SQLite instead of PostgreSQL as a database engine. We also replaced the not so well working "setup script", which asked you questions to help you set up your instance, with a command line tool called plm, which has various commands to manage your instance. The default license for new instances was also changed to CC-BY-SA, which is a better default than CC-0. And finally, it is possible to choose which logo to use for your instance from the configuration.
A REST API to create articles was also added. We built a command line tool, called amsterdam (thanks @Doshirae@social.wxcafe.net for the great name), on top of this new API to import posts stored on your computer.
Finally, maybe the biggest feature in this release, a search engine was added to Plume!
It has been implemented with the amazing tantivy crate, which works very well and barely consumes any resources. If you want details about the technical aspects of this feature, you can read this article.
On the technical side, a lot of changes happened too. Thanks to @kemonine@social.holdmybeer.solutions and @eliotberriot@mastodon.eliotberriot.com our Docker files are working great, even on ARM, the Docker image became smaller and is now published in the Docker registry (and in Lollipop cloud's one for ARM builds).
The styling is now done with SCSS, which is much more convenient than regular CSS.
We also added some constraints at the database level (there were almost none before), to avoid saving invalid data.
We try to take advantage of Rust's power as much as possible. For that purpose, we reworked our error handling to make it use the powerful Result type instead of making Plume crash on error. We also converted our front-end code from JavaScript to Rust thanks to WebAssembly. It means we can now easily share code between the back-end and the front-end if needed. Our templates have also been rewritten using ructe, which is a crate that compiles your page templates to Rust functions, guaranteeing their correctness before Plume starts, and improving performance.
A feature that recently landed in Rust is what we call proc-macros. They are special functions that can transform some code into something else in very powerful ways. I used this feature to create a few proc-macros that automatically generate translation files and use them in Rust apps. If you are curious, here is the repository.
We also started to write more tests, and even if the coverage is still very low (around 30% at the moment), we are making progress, and I hope we can get close to 100% of tested code for the first real release. @FDB_hiroshima also migrated our Continuous Integration to Circle CI, which is much faster than Travis CI.
If you want to give this new version a try, Fediverse.blog is already up-to-date. Otherwise, just give some time to your instance admin to update. 😄
If you are an instance admin, and you want to update or install a new instance, please refer to the release notes on GitHub.
Even if we made a lot of progress, tons of features are still missing for Plume to be really usable. I can't tell for sure what will be next, but I would like to work on multi-author blogs and articles, and on moderation tools, because the ones we currently have are too basic to be really useful. @hirojin@dev.glitch.social is also working on custom domains for blogs, so that may be in the next update!
However, I don't know how much I will be able to contribute in the following months, because my exams period will start soon, and I will have less time to give to Plume. Fortunately, I'm not the only person working on Plume.
I probably forgot a lot of people in this article, and I don't have everyone's real @, I'm sorry. I would like to thank everyone who made this release possible (even if your name was not cited here). Thank you! 💜
]]>There was an issue on Plume, issue 149, about adding some search capabilities to Plume. I've never worked on something like that, never touched ElasticSearch or anything, so I thought it would be cool to discover how such things work.
The first thing to do was to go on crates.io to see what crates (libraries) could help me.
Doing some quick sorting, I could already eliminate most of the results. Some because they're tied to a database back-end we might not provide, others because they're not documented enough. In the end, I chose Tantivy because it was the only one which was well documented, not a client library for another piece of software, and actually fits our needs.
After digging for quite some time in Tantivy's documentation and in its examples, I got a good idea of how it works and what it is capable of. As one would do with a SQL database, I started to work out what I would store, and how. Here is what I came up with:
fields:
post_id => i64(id) -> Stored
creation_date => i64(date) -> Indexed
instance => i64(id) -> Indexed
author => string -> Indexed | IndexRecordOption::Basic > ContentTokenizer
hashtag => string -> Indexed | IndexRecordOption::Basic > ContentTokenizer
mention => string -> Indexed | IndexRecordOption::Basic > ContentTokenizer
blog => string -> Indexed | IndexRecordOption::WithFreqsAndPositions > ContentTokenizer
content => string -> Indexed | IndexRecordOption::WithFreqsAndPositions > ContentTokenizer
subtitle => string -> Indexed | IndexRecordOption::WithFreqsAndPositions > ContentTokenizer
title => string -> Indexed | IndexRecordOption::WithFreqsAndPositions > ContentTokenizer
license => string -> Indexed | IndexRecordOption::Basic > PropertyTokenizer
lang => string -> Indexed | IndexRecordOption::Basic > PropertyTokenizer
Tokenizers:
PropertyTokenizer => NgramTokenizer(2..8, prefix_only=false).lowerCase:
lang, license
ContentTokenizer => SimpleTokenizer.limit(40).lowerCaser:
author, blog, content, hashtag, subtitle, title
Pretty much everything is here: field names, their types, how they should be managed... but this is still a first sketch. At this point I hadn't played with Tantivy yet, and things might still change. Let me explain a bit the choices I made. First, let's start with types. Tantivy supports 4:
With each field come some options, allowing us to tell Tantivy how to store and index things. The first option is whether it is stored. A stored field can be retrieved after searching for a document. Plume already stores its content in a database, so we mostly need to get the id of a post; we don't want to store every post twice, as that would be a waste of space. Then the question is: is it indexed? Most fields are, because we want to search through those indexes, but post_id is not, because no one wants to search for a post they already found. Edit: after all, post_id needs to be indexed, as this is the key we use when we try to delete or edit a post.
With strings come some more options: what should the index contain about them? Only which words are present, or also their frequency and position? Storing positions for hashtags is not of much use, as we don't put them in any particular order. Nor is frequency, as a hashtag usually doesn't appear twice in a post. On the other hand, searching for an exact expression is quite useful in a title or a full post. Then how do we tokenize our text? The simple, but not most efficient, way to tokenize text would be to split each word apart. This could be enough, but we could do better, and I actually wanted to play a bit with Tantivy, so I looked for what more I could do. Licenses are single words, so we could index them the easy way and it would be enough, but I wanted someone who searches for Creative Commons content to also find CC-0, CC-BY... so I used the n-gram tokenizer, which transforms a word into every n-gram it is composed of. "CC-0" would therefore become ["CC", "CC-", "CC-0", "C-", "C-0", "-0"] if we consider n-grams of at least 2 characters, and would then be matched by "CC". We then lowercase the result so "CC" and "cc" are equivalent.
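To make the n-gram idea concrete, here is a small standalone sketch in plain Rust. It does not use Tantivy's NgramTokenizer; the ngrams function is a hypothetical helper written for this example, and it treats both length bounds as inclusive.

```rust
/// Returns every lowercased substring of `word` whose length (in characters)
/// is between `min` and `max` inclusive, ordered by start position.
/// Assumes `min >= 1`.
pub fn ngrams(word: &str, min: usize, max: usize) -> Vec<String> {
    let chars: Vec<char> = word.chars().collect();
    let mut out = Vec::new();
    for start in 0..chars.len() {
        for len in min..=max {
            if start + len > chars.len() {
                break; // longer n-grams would run past the end of the word
            }
            let gram: String = chars[start..start + len].iter().collect();
            out.push(gram.to_lowercase());
        }
    }
    out
}
```

Calling ngrams("CC-0", 2, 8) yields the six tokens listed above, in lowercase, so a query for "cc" matches both CC-0 and CC-BY. It also shows why this approach is costly for long texts: the number of tokens grows quickly with the length of each word.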
For post content, being able to search any n-gram would however generate a lot of data to store, so we need to use something else. The something I found is called stemming. The idea is to extract the stem of each word, so that "random" would match "randomly". Without stemming, both are considered totally different tokens and won't match each other. However, stemming rules are very language dependent (the same in French would be "aléatoire" and "aléatoirement" for instance, where the suffix is "ment" instead of "ly"). Tantivy provides a built-in stemmer for English, but not for other languages, so I decided not to use this yet. Maybe later. Another feature is stop words: it stops some words from being indexed, because they are of no use. For example, "the" is a very common word in English, so searching for it is of no use. But stop words can cause some issues. A very common word in one language may be a very rare one in another. To make matters worse, sometimes common words mean something special. Think of "The Who", a famous rock band: using stop words, it's likely both "the" and "who" would get stopped, and no one would ever find that band again.
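The "The Who" problem is easy to reproduce with a toy tokenizer. The sketch below is plain Rust, not Tantivy's stop word filter; the tokenize function and the stop word list are made up for this example.

```rust
use std::collections::HashSet;

/// Splits `text` on whitespace, strips surrounding punctuation, lowercases
/// each word, and drops every word found in `stop_words`.
pub fn tokenize(text: &str, stop_words: &HashSet<&str>) -> Vec<String> {
    text.split_whitespace()
        .map(|word| {
            word.trim_matches(|c: char| !c.is_alphanumeric())
                .to_lowercase()
        })
        .filter(|word| !word.is_empty() && !stop_words.contains(word.as_str()))
        .collect()
}
```

With a stop word list containing "the" and "who", tokenizing "The Who" returns no tokens at all, so the band's name never reaches the index, while "The random band" still produces "random" and "band".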
So now that we know a bit about Tantivy, it's time to actually write some code. As it's the first time I use this crate, I'll first do some tests with the above configuration, but outside of Plume, just to discover how everything works.
After playing a bit with Tantivy, I've already learnt some interesting things about its internals. I started with the basic_search example, which I modified to my needs: first to use my own schema that I defined above, then to test a bit how queries worked. Then, as I was really interested in stemming, I looked into how to implement a TokenFilter, basically copied LowerCaser, and added some println calls to understand how it was called.
The struct we create ourselves is, well, created only once by ourselves. Its transform function is called on each document we add, and creates a new TokenStream. This TokenStream then iterates over each token in the document, modifying or removing the ones it wants. For queries, the TokenStream sees one token per word, or one token for multiple words if they are quoted.
So it could be possible to put a sort of header on each document indicating its language, and to stem differently based on that. For queries, we would then have to either test each language or detect the language of a query. Although it won't be included in the first try, it's good to know.
Also, according to the documentation, the QueryParser is bad at handling queries which are not properly formatted. This will be fixed soon, as there is a pull request trying to add a lenient mode to the parser. But for now we will need to make some sort of adapter, getting a query from a user and transforming it into a Query ourselves, or into a query string that Tantivy will always understand.
So now that we have explored a bit how it works, let's start playing with it inside Plume. As a public interface for the module, we will need some kind of builder, something to add, edit and remove documents, and obviously something to do searches. We also want some way to regenerate the whole index via a new plm sub-command.
The first thing to do is to create the module, which is what I've done in this commit. There is nothing really fancy: adding Tantivy to the dependencies, also adding whatlang to extract the language from content, and itertools because it makes life easier when doing certain things with iterators. I've made two functions to create an index, one only reading an existing one, and the other creating it. Plume should only open an existing index, and plm, once implemented, will be used to create it beforehand. Well, actually there is one thing that is a bit fancy: date storage. Tantivy allows for int storage, which is obviously what I went for. But I did not store dates as seconds since 1970, the famous Unix time, because Tantivy's documentation states that RangeQuery, which we need to search for documents created between two dates, iterates over the whole range. Iterating over millions of seconds is definitely slower than iterating over the 30 days of a month. So I decided dates would be stored as a number of days since a particular point, to be more efficient. I also made the choice to commit to Tantivy's store on every post creation/deletion, which turned out to be quite slow. Tantivy seems to prefer large batch jobs over single-document ones.
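The date trick can be sketched in a few lines. This example assumes the reference point is the Unix epoch, which the text above does not specify, and the function names are hypothetical.

```rust
const SECONDS_PER_DAY: i64 = 86_400;

/// Converts a Unix timestamp in seconds to a whole number of days since the
/// epoch (assumed reference point). `div_euclid` rounds towards negative
/// infinity, so timestamps before 1970 land on the correct day too.
pub fn days_since_epoch(unix_seconds: i64) -> i64 {
    unix_seconds.div_euclid(SECONDS_PER_DAY)
}

/// Number of distinct values a range query has to walk over when both bounds
/// are inclusive.
pub fn range_len(from: i64, to: i64) -> i64 {
    to - from + 1
}
```

For a 30-day window starting at the epoch, a range over raw seconds spans 2,592,001 values, while the same window expressed in days spans 31. The price is that searches cannot be more precise than one day.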
Then in another commit, I added the search command to plm, with two sub-commands, one to create a new index, and the other to read all posts from Plume's database and copy them into the index. I also added calls to the functions I created in the last commit, so that creating a new post, or deleting a blog, is reflected accordingly in Tantivy's results. This required me to change the prototypes of many functions, but wasn't actually much work. The Rust compiler just kept yelling at me "I don't have enough arguments for this function", but once it shut up, indexing seemed to be working. I also added a commit() function to my Searcher, which simply calls Tantivy's one, and removed the commit on each edit. That way I could just commit regularly, with a thread waking up every half an hour or so. Changing function prototypes broke the tests as I forgot to update them, but this was quickly fixed.
So now I needed to call commit() regularly, but our current thread pool only allowed running tasks as soon as possible, and could not schedule one. I didn't feel like running a new thread dedicated to auto-committing, so I decided to move to a more advanced thread pool.
The next thing was to finally do some searches. At this point it looked like it was working, doing some computation, writing results to disk... but I hadn't done a single search yet. So I made a basic front-end with the bare minimum, and used Tantivy's QueryParser to make my first queries. And hooray, all seemed to be working. Even paginated search. At this point it works; doing advanced queries is a bit of a pain, but it works.
However, there is still at least one big problem: Rocket (the web framework we use in Plume) does not support clean shutdown. Tantivy's index writer takes a lock on the index directory, and is supposed to release it when it gets dropped, but with Rocket's State owning it, this drop never happens. This leads to an error every time you start Plume after having killed it, even with ctrl-c. This was fixed in the next commit, where I added a drop_writer function to Searcher. After calling it, any attempt to write to the index will panic. So this is called on process termination, just before calling exit. There might be a race condition where one could edit a document after the writer got dropped but before Plume exited. However, this would only mean that the post won't be indexed, and as I see no easy way to fix it I will leave it that way.
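One way to get this "drop on demand" behavior is to keep the writer in an Option behind a Mutex. The sketch below is an assumption about the general shape, not Plume's actual Searcher; it is generic over the writer type W so that it runs without Tantivy, and with_writer and is_dropped are names invented for this example.

```rust
use std::sync::Mutex;

/// Holds a writer that can be dropped explicitly while the Searcher itself
/// stays alive (for example because a framework owns it until exit).
pub struct Searcher<W> {
    writer: Mutex<Option<W>>,
}

impl<W> Searcher<W> {
    pub fn new(writer: W) -> Self {
        Searcher {
            writer: Mutex::new(Some(writer)),
        }
    }

    /// Runs `f` with the writer. Panics if the writer was already dropped.
    pub fn with_writer<R>(&self, f: impl FnOnce(&mut W) -> R) -> R {
        let mut guard = self.writer.lock().expect("writer lock poisoned");
        let writer = guard.as_mut().expect("index writer was already dropped");
        f(writer)
    }

    /// Drops the writer now, releasing whatever it holds (such as a lock on
    /// the index directory).
    pub fn drop_writer(&self) {
        self.writer.lock().expect("writer lock poisoned").take();
    }

    /// Tells whether `drop_writer` has already been called.
    pub fn is_dropped(&self) -> bool {
        self.writer.lock().expect("writer lock poisoned").is_none()
    }
}
```

Calling drop_writer from the termination handler, just before exiting, runs the writer's destructor even though the Searcher is never dropped. The race condition described above is visible here as well: a with_writer call that arrives after drop_writer panics instead of indexing the post.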
The next commit is quite big. It contains some refactoring to avoid code duplication, and some fields changed type because, on reflection, indexing the instance name into Tantivy is easier than indexing the id and going through an SQL query to map one to the other. I also created my own tokenizer, which is a copy of SimpleTokenizer but doesn't cut tokens on punctuation, as instance domains contain dots, and user names can contain underscores or other non-alphanumeric characters. This commit also contains a query parser, which is useful mainly because it never fails. That doesn't mean it's better than Tantivy's; it's just way less generic, so it can make assumptions Tantivy can't. It also converts date ranges from the string a browser sends (YYYY-MM-DD) to our own internal representation. I also modified the front-end to match the back-end, with a basic search field and many advanced fields hidden unless you open the spoiler. This is pretty much the final template. It's not exactly beautiful, but it's functional.
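The date conversion can be done without any dependency. The sketch below assumes the internal representation is a day count since 1970-01-01 (the text only says "a particular point") and uses a well-known civil-calendar formula; parse_date and days_from_civil are hypothetical names. In keeping with a parser that never fails, malformed input yields None instead of an error.

```rust
/// Days between 1970-01-01 and the given proleptic Gregorian date.
pub fn days_from_civil(year: i64, month: u32, day: u32) -> i64 {
    // Shift the year so that it starts in March; leap days then fall at the end.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400); // 0..=399
    let shifted_month = (i64::from(month) + 9) % 12; // March = 0
    let day_of_year = (153 * shifted_month + 2) / 5 + i64::from(day) - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Parses a browser-style "YYYY-MM-DD" string into a day count.
/// Only basic bounds are checked (no per-month day limits).
pub fn parse_date(input: &str) -> Option<i64> {
    let mut parts = input.splitn(3, '-');
    let year: i64 = parts.next()?.parse().ok()?;
    let month: u32 = parts.next()?.parse().ok()?;
    let day: u32 = parts.next()?.parse().ok()?;
    if !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some(days_from_civil(year, month, day))
}
```

A date range from the search form then becomes a pair of small integers that can be handed to a range query over the creation date field.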
Before merging, I decided to split searcher.rs into multiple sub-modules, as the code is quite dense. There are not that many lines, but I used macros a lot to reduce duplication, and that does not help with readability. I also added tests for the parser, fixed some issues with it, and made it more intelligent when you search for posts from an author on another instance, and that's it.
To conclude, I'd say this was very interesting. Tantivy is a very powerful crate. Having taken a quick look at its internals, I can say it is also a complex piece of software, but the interface it gives is not that hard to use. Most of the difficulties I encountered were because I decided I wanted to write a query parser matching our needs, but one of Tantivy's examples manages to do most of what I've done (with a smaller schema and using Tantivy's built-in QueryParser) in about 70 lines of actual code. Thanks a lot to Fulmicoton for this library; it was a pleasure to discover full-text search with it.
Now if we want to make even better use of Tantivy in Plume here is a short list of things we could do: