[:cs]Optimalizace obchodního systému[:en]Investment strategy optimization[:]

[:en]An investment strategy is essentially a set of exact rules (an algorithm) that defines investment decisions (buy and sell orders) and can analyze and manage risk in real time based on the current market situation. Investment decisions fall into two groups: entry rules for opening investment positions and exit rules for closing them. Our research department takes these two rule categories and applies a different development approach to each.

Standard old-fashioned approach

The standard old-fashioned development approach relies on rules written by financial experts, which depend on their experience, knowledge and personal attitude. These entry and exit rules are usually very subjective and may lack a deeper analysis of the various datasets (news, fundamental, technical, psychological). However, that doesn’t mean an investment algorithm built on these rules is wrong or can’t generate profit over the long term. It can be very robust and stable.

New era approach

Our research department loves new technologies that can help deliver higher and more stable performance, lower drawdowns with shorter recovery periods, a better Sharpe ratio and improvements in other key performance metrics. Current research focuses on applying machine learning and genetic programming to investment strategies, specifically to improve the effectiveness of the exit rules. We use a few simple entry rules defined by a financial market expert (the old-fashioned approach), while the exit rules are generated by modern technologies that analyze different datasets in depth. Exit rules contain simple or sophisticated patterns found by genetic programming and machine learning, but the whole approach remains under expert control, and each rule is validated against strict conditions to avoid over-fitting.

Customers will be able to benefit from these algorithms either by running them directly on their own accounts through specialized trading platforms or by investing in large funds that use them.

Vision

Our vision is to enhance human skills with data analysis technologies and so deliver higher and more stable performance. We are not trying to take over human decisions; we are trying to make them better.

If you would like more information, stay tuned to our website or get in touch! It would be a pleasure to invite you to a meeting at our office.

Michal Dufek[:]

Stock Screener

[:cs]Jak jsme Vás informovali v našich předchozích příspěvcích, náš tým se nezabývá pouze state-of-art technologiemi a researchem obchodních strategií, které využívají hedgové fondy. Do našeho portfolia patří i běžné a relativně jednoduché reportovací a analytické nástroje. Jedním z těchto nástrojů je intuitivní stock screener, který můžete využít k jednoduchému seřazení “nejlepších” titulů dle Vámi navolených preferencí.

Další stock screener?

Většina stock screenerů je založena na tom, že máte k dispozici určitý universe (množinu) aktiv, který prostřednictvím filtrů (které máte k dispozici) prosíváte, dokud Vám nezbydou taková aktiva, která vyhovují předvoleným kritériím. Uživatelskou nevýhodou takového workflow je ten fakt, že potřebujete přesně vědět, co hledáte. Valná většina “hledajících” uživatelů ovšem v počátečním okamžiku neví přesně, dle jaké metodiky svoje výstupy hledá.

Náš stock screener tento fakt respektuje a řeší jej pomocí dvoustupňové klasifikace výstupu.

Jak to funguje?

V našem screeneru nejprve intuitivně volíte priority (technicky vyjádřeno se jedná o filtry), díky kterým vyjadřujete svoje preference (měna investice, riziková averze, investiční horizont apod.). Díky těmto preferencím nadefinujete metodu hodnocení skenované množiny aktiv, a tedy v důsledku Vašeho výběru dojde k nastavení hodnoticího modelu, který škáluje jednotlivé assety do žebříčku.

Black box, o kterém nevím, co uvnitř dělá?

Na výstupu dostanete seznam akcií, které i) vyhovují Vašim prioritám a ii) jsou seřazeny “od nejlepšího” dle hodnocení modelu, který reflektuje Vaše preference. Jednotlivá kritéria obsažená v hodnoticím modelu uvidíte ve výstupní tabulce spolu s ohodnocenými assety.

Nemůže se tedy stát, že byste nevěděli proč a jak model k výsledku došel. Hodnoticí model není pouze transparentní, ale je také libovolně upravitelný prostřednictvím nastavení indikátorů, které komplexní hodnoticí model utváří. Tuto funkci mohou ocenit zejména odborně zdatní analytici, kteří si dle vlastního úsudku přejí určité technické indikátory preferovat, nebo naopak diskriminovat.

Pokud Vás zajímá více o této aplikace, sledujte náš web nebo se na nás přímo obraťte.

Michal Dufek[:en]As we mentioned in our previous posts, our team does not deal only with state-of-the-art technologies and research into the trading strategies used by hedge funds. Our portfolio also includes common and relatively simple reporting and analytical tools. One of them is an intuitive stock screener that you can use to easily rank “the best” stocks according to your selected preferences.

Another Stock Screener?

Most stock screeners assume that you start with a certain universe (set) of assets and narrow it down with the available filters until only the assets matching your criteria remain. The disadvantage of this workflow for the user is that you need to know exactly what you are looking for. At the outset, however, the vast majority of “searching” users do not know precisely which methodology to use to find their results.

Our stock screener respects this and addresses it with a two-stage classification of the output.

How Does It Work?

In our screener, you first intuitively choose priorities (technically speaking, filters) that express your preferences (investment currency, risk aversion, investment horizon, etc.). With these preferences you define how the scanned set of assets is evaluated; your choices configure the evaluation model, which then ranks the individual assets.

A Black Box That Hides What It Does Inside?

The output is a list of stocks that a) meet your priorities and b) are sorted “from the best” according to the evaluation model that reflects your preferences. The individual criteria included in the evaluation model are shown in the output table together with the rated assets. You therefore always know why and how the model reached its result. The evaluation model is not only transparent but also freely adjustable through the settings of the indicators that make up the composite model. This will be appreciated especially by expert analysts who want to use their own judgement to favor certain technical indicators or, conversely, to down-weight them.
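The two-stage idea (filter by priorities, then rank by a transparent weighted score) can be sketched roughly as follows. This is only an illustration: the `Stock` fields, the indicator names `momentum` and `value`, the weights and the sample data are all hypothetical and are not taken from the actual screener.

```python
from dataclasses import dataclass


@dataclass
class Stock:
    ticker: str
    currency: str
    volatility: float  # hypothetical risk measure, lower = calmer
    momentum: float    # hypothetical indicator, higher = better
    value: float       # hypothetical indicator, higher = better


def screen(stocks, currency, max_volatility, weights):
    """Stage 1: filter by priorities. Stage 2: rank by a weighted score."""
    # Stage 1: keep only the stocks that satisfy the user's priorities.
    eligible = [s for s in stocks
                if s.currency == currency and s.volatility <= max_volatility]

    # Stage 2: score the remaining stocks and keep every per-criterion
    # contribution, so the output table shows how each score was reached.
    rows = []
    for s in eligible:
        parts = {name: w * getattr(s, name) for name, w in weights.items()}
        rows.append({"ticker": s.ticker, "score": sum(parts.values()), **parts})
    return sorted(rows, key=lambda r: r["score"], reverse=True)


universe = [
    Stock("AAA", "USD", 0.20, 0.8, 0.3),
    Stock("BBB", "USD", 0.50, 0.9, 0.9),  # too volatile for this user
    Stock("CCC", "EUR", 0.10, 0.7, 0.7),  # wrong currency for this user
    Stock("DDD", "USD", 0.15, 0.4, 0.9),
]
ranking = screen(universe, "USD", 0.30, {"momentum": 0.5, "value": 0.5})
for row in ranking:
    print(row)
```

Changing the entries of `weights` corresponds to the adjustment described above: an analyst can favor or down-weight individual indicators and immediately see the effect in the per-criterion columns.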

If you would like more information about this application, stay tuned to our website or contact us directly.

Michal Dufek[:]

Trading System Generator

[:en]

Intro

Once upon a time, the term artificial intelligence was defined. Its subcategories emerged soon after, some of them inspired by nature: a reimplementation of evolution by natural selection, as proposed by Charles Darwin and Alfred Russel Wallace, or deep learning inspired by the structures found in our own brains. Nowadays most implementations are tied to the computational resources of general-purpose processors, graphics cards, FPGAs and ASICs.

Genetic programming algorithm

It seems we may have started with the easier target. Genetic programming is a technique that generates computer programs using an evolution-based algorithm.

At the start of a run, the evolution algorithm generates a population with a set number of randomly generated individuals. Each individual is represented as a list of a specific number of trees.

Each tree in the individual represents a specific aspect of trading strategy behavior: asset selection, entry condition, entry filter, stop loss, exit condition or exit filter. In this case, that makes six trees in every individual.

Each individual is run as a trading strategy. It may generate some trades, from which statistics are calculated. Various statistics can serve as the individual’s fitness. Right now we use the Sharpe ratio multiplied by the number of trades, which pushes the evolution algorithm to prefer individuals with a higher number of trades.
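As a rough illustration of such a fitness measure, the sketch below multiplies a Sharpe ratio by the trade count. It assumes the Sharpe ratio is computed from per-trade returns, without a risk-free rate and without annualization; these simplifications and the function names are ours and may differ from the real implementation.

```python
import statistics


def sharpe_ratio(trade_returns):
    """Mean return divided by its sample standard deviation.

    Assumptions: no risk-free rate and no annualization.
    """
    if len(trade_returns) < 2:
        return 0.0  # not enough trades to estimate a spread
    spread = statistics.stdev(trade_returns)
    if spread == 0:
        return 0.0  # avoid division by zero for constant returns
    return statistics.mean(trade_returns) / spread


def fitness(trade_returns):
    """Sharpe ratio multiplied by the number of trades."""
    return sharpe_ratio(trade_returns) * len(trade_returns)


few_trades = [0.01, 0.03]
many_trades = [0.01, 0.03, 0.01, 0.03, 0.01, 0.03]
print(fitness(few_trades), fitness(many_trades))
```

Both sample strategies alternate between the same two returns, yet the one with more trades scores higher, which is the bias toward frequently trading individuals described above.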

Based on fitness, the evolution algorithm selects individuals for the mating and mutation phase. The mated and mutated individuals become part of the population, and their fitness is calculated again. The best individual is kept in the hall of fame.

This iterative process continues, and after several generations we expect to find a working trading strategy in the hall of fame.
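The loop described above (random population, fitness, selection, mating, mutation, hall of fame) can be sketched in plain Python. To keep the example short, each individual is a list of six numbers standing in for the six trees, and `toy_fitness` is a placeholder for running a strategy and computing its statistics. Tournament selection, one-point crossover, Gaussian mutation and full generational replacement are our assumptions, not necessarily the operators used in the real system.

```python
import random

N_TREES = 6  # asset selection, entry, entry filter, stop loss, exit, exit filter


def toy_fitness(individual):
    """Placeholder for a backtest: genes closer to 1.0 score higher."""
    return -sum((gene - 1.0) ** 2 for gene in individual)


def tournament(population, rng, size=3):
    """Pick the fittest of `size` randomly drawn individuals."""
    return max(rng.sample(population, size), key=toy_fitness)


def mate(a, b, rng):
    """One-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, N_TREES)
    return a[:cut] + b[cut:]


def mutate(individual, rng, rate=0.2):
    """Perturb each gene with probability `rate`."""
    return [g + rng.gauss(0, 0.3) if rng.random() < rate else g
            for g in individual]


def evolve(pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-2, 2) for _ in range(N_TREES)]
                  for _ in range(pop_size)]
    hall_of_fame = max(population, key=toy_fitness)
    initial_best = toy_fitness(hall_of_fame)
    for _ in range(generations):
        # Select parents, mate them and mutate the offspring.
        population = [mutate(mate(tournament(population, rng),
                                  tournament(population, rng), rng), rng)
                      for _ in range(pop_size)]
        # The hall of fame keeps the best individual seen so far.
        best = max(population, key=toy_fitness)
        if toy_fitness(best) > toy_fitness(hall_of_fame):
            hall_of_fame = best
    return initial_best, hall_of_fame


initial_best, champion = evolve()
print(initial_best, toy_fitness(champion))
```

Because the hall of fame is only replaced by a strictly better individual, the final champion can never be worse than the best member of the initial random population.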

Practicalities

The “slowness” of Python is negligible, as most of the time is spent calculating individual fitness. Our individuals create signals: the genetic programming trees are built from optimized numpy and talib functions (C code), and the trading algorithm is implemented as a fully optimized state machine in Cython. In other words, from a programming point of view it is a mixed vectorized and event-based approach to trading system programming.

Possibilities

By changing the fitness function we can push the evolution algorithm to generate specific kinds of trading strategies, be it a specific risk profile, maximum profit or an optimized portfolio. It is also possible to optimize just part of an existing strategy: for example, we can take an existing strategy with defined entries and let genetic programming evolve its exits. Additional inputs can be used as well, such as sources of sentiment information, market and intermarket indices, deep learning networks with pre-trained alphas, and well-known market alpha signals.

An interesting property of genetic programming combined with a large number of alpha factors may be the ability to select the ones that really matter in the market, without being tied to the usual evaluation over a fixed number of days into the future.

Miloň Krejča[:]

[:cs]Strategie pro obchodování komodit[:en]Commodity Trading Strategies[:]

[:cs]Research & Development team se aktuálně zaměřuje na vývoj obchodních strategií zaměřených na futures kontrakty pro ropu, zemní plyn a zlato. Tyto komodity se obchodují na centralizovaných amerických burzách NYMEX a COMEX a patří mezi nejvíce likvidní, přičemž se denně realizují transakce za miliardy dolarů. Vysoká likvidita a zájem ostatních obchodních subjektů vytváří vhodné prostředí pro implementaci krátkodobých momentových strategií využívajících neefektivity tržního prostředí. Vzhledem k povaze těchto komodit jak z hlediska potřebné volatility, tak i z hlediska struktury samotného finančního instrumentu, který je derivátem využívajícím finanční páku, je vhodné zařazení těchto komodit do výzkumného portfolia.

R&D team vyvíjí a optimalizuje obchodní strategie vycházející z rule-based konceptu, které vykazují pozitivní performance a s určitou pravděpodobností dokáží předpovědět vznik nového krátkodobého cenového momenta.

Implementací moderních technologií pro analýzu performance strategií a souvisejících rizik je prováděn exaktní a přesný výzkum, který dokáže rigorózně otestovat danou strategii. Oproti “zastaralým” technologiím dokáží ty moderní implementovat reálnou exekuci obchodních příkazů založených na skutečně realizovaných obchodech. Výzkumník tak dostává výsledky performance strategie téměř totožné s reálnou exekucí obchodní strategie. Právě při aplikaci zastaralých technologií, které neprováděly simulaci reálného tržního prostředí, docházelo často k výrazně odlišným výsledkům navržené a optimalizované strategie vůči reálné aplikaci.

Vývoj a optimalizace strategií pomocí moderních technologií je jeden z nosných pilířů celého projektu.

Po dokončení vývoje budou strategie pro obchodování ropy, zemního plynu a zlata zařazeny do portfolia, které bude řízeno obchodním konceptem metastrategie založené na chytré alokaci prostředků do jednotlivých strategií.

Myšlenkové workflow:

  • pracovali jsme s futures contract pro ropu a zlato
  • využívali jsem nástroje Zipline, Pyfolio, Pandas, MetaTrader, NumPy
  • řešili jsem momentové strategie zaměřené na krátkodobou spekulaci
  • inovací vůči stávajícím strategiím je využití moderních analytických nástrojů pro analýzu a minimalizaci rizika (Pyfolio)
  • tento výzkum cílí na brzké nasazení strategií v reálném tržním prostředí a následně bude navazovat optimalizace pomocí genetických algoritmů zaměřená na metastrategie. Tato strategie bude jednou z části početného portfolia řízeného metastrategií

Jan Budík[:en]The Research & Development team is currently focused on developing trading strategies for Crude Oil WTI, Natural Gas and Gold futures contracts. These commodities are traded on the centralized US exchanges NYMEX and COMEX, where daily trading activity exceeds billions of dollars. Very high liquidity and the interest of other market participants create a suitable environment for short-term momentum trading strategies. These commodities belong in the research portfolio because they offer the necessary volatility and because the instrument itself is a leveraged derivative.

The R&D team develops and optimizes rule-based trading strategies that show positive performance and can predict upcoming short-term price momentum with sufficient probability.

New technologies help the R&D team analyze the performance of trading strategies and the related risks rigorously and precisely. Unlike the “old” technologies, the modern ones can simulate realistic execution of trading orders based on all trades actually realized in the selected time period. Performance analysis with old technologies often led to unreliable statistical results, and trading strategies with great historical performance often failed in the real market.

Development and optimization with modern technologies is one of the main pillars of the project. Once the trading strategies for Crude Oil WTI, Natural Gas and Gold are finished, they will be included in a metastrategy-driven portfolio with smart asset allocation.

Mind-flow:

  • Crude Oil WTI, Natural Gas, Gold futures contracts
  • Implementation using modern frameworks and libraries: Zipline, Pyfolio, Pandas, MetaTrader, NumPy
  • Development of short-term price momentum trading strategies
  • Our innovation over existing trading strategies lies in the use of modern frameworks and analytical tools
  • Our research aims to deploy the proposed trading strategies in the real market within a short period. All strategies will be included in a metastrategy-driven portfolio

Jan Budík[:]

Database Creation and Text Analysis in Services

Our EQS software analyzes free-text service requests (e.g. “I am looking for a nursery in Brno which takes children as young as 1 year old”) and presents the user with appropriate suppliers.

To pair the data correctly, the search algorithm needs a sufficient amount of data to learn from. We approached this problem by building web crawlers, which gave us the data we needed in Czech from external sources. Above all, however, we are building our own database of all activities carried out in the Czech Republic.

When creating it, we did not want to limit ourselves to a set of services (for example, the nursery or children’s group mentioned above) or to occupations (teacher, nanny, …); the aim was to create a complete database of all activities that people can possibly perform. For this reason, we merged these areas and supplemented them with additional activities (for example, “babysitting” or the more detailed “night-time babysitting”).

The primary input for the database was the “National System of Occupations”, which we extended with categories from commercial enquiry servers. In this way we created database areas or, more precisely, clusters of the type teacher/nanny/teacher assistant (= occupation) + nursery/children’s group (= service) + babysitting/children’s programme (= activity). We refer to all of these categories collectively as activities.

All activities were supplemented with the keywords typical for them (children to nurseries, babysitting, …). Since our algorithm assigns the highest weight to keywords, they are far more important than the names of the respective activities; the clusters mentioned above are therefore formed by the set of words associated with the given activity or area of activities.
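A minimal sketch of keyword-weighted matching might look like this. The weights, the tiny `ACTIVITIES` table and the simple word-overlap scoring are hypothetical illustrations of the principle that keywords count for more than activity names; the actual search algorithm is not described in this post.

```python
import re

KEYWORD_WEIGHT = 3.0  # assumed: keywords count far more than names
NAME_WEIGHT = 1.0     # assumed

# Hypothetical activities with their associated keywords.
ACTIVITIES = {
    "Children's Group": {"children", "group", "nursery", "kids"},
    "Babysitting": {"babysitting", "children", "nanny"},
    "Car Repair": {"car", "repair", "engine"},
}


def tokenize(text):
    """Lowercase the text and return the set of its alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_activities(query, activities=ACTIVITIES):
    """Return (activity, score) pairs, best match first, zero scores dropped."""
    words = tokenize(query)
    scored = []
    for name, keywords in activities.items():
        score = (KEYWORD_WEIGHT * len(words & keywords)
                 + NAME_WEIGHT * len(words & tokenize(name)))
        if score > 0:
            scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


matches = rank_activities("I am looking for a nursery in Brno which takes children")
print(matches)
```

In this toy data the query never mentions "group", yet "Children's Group" ranks first because two of its keywords appear in the text.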

When aggregating keywords, we used both automatically generated and manually built databases and, last but not least, our own descriptions and the suggestions of suppliers whom we have been calling over the last 6 months to offer a free presentation of their services on our test portal mojilidi.cz.

The primary database was then published on the above-mentioned website, and we began to face real-world operation. Those interested in presenting their services, from any area, entered a description of their activity into the search bar, for example: “We are running a children’s group in Brno which specialises in ABA therapy, speech therapy or exercising with kids.”

After analyzing the input text, the system presents users with the activities identified as most relevant (for example Children’s Group, Night Babysitting, Babysitting or, wrongly, Nutrition Therapist).

Users can either sign up for an existing activity, edit or add keywords and a description of their services, or add a completely new activity.

Adding a new activity is subject to confirmation by an administrator, so that duplicates such as taking care of a kid / taking care of kids do not arise. Given how the search algorithm works, similar new activities would not be a problem for it. However, for the sake of tracking statistics and of signing users up for an existing activity with the most relevant keywords, we try to approve only entirely new activities.

In this way we have been extending our own database over the course of a year. Activities with a higher number of users have a database of the most interesting keywords and phrases, which are recommended to users right at registration.

We also track which (and how many) activities include a particular keyword; see the keywords listed above. Furthermore, we track the number of competitors in individual regions. By combining the results of telephone calls with users with an analysis of new customers acquired through advertising in particular areas (for example, we are finding that car repair shops are not interested in registering, whereas text proofreaders are highly interested), we are building up interesting statistics about individual market segments.

The picture relates to the Project Architect activity.

The database keeps growing and being updated as the number of users increases. Likewise, our search algorithm keeps improving and offers more relevant results. In the next article, we will describe how we translated our database into English and German and what interesting features this made possible.

Jiří Fuchs

[:cs]Výhled pro rok 2019[:en]Outlook 2019[:]

[:cs]Na začátku nového roku si náš tým sestavil výhled výzkumných aktivit pro rok 2019 a prostřednictvím tohoto postu bychom Vás rádi seznámili s nejvýznamnějšími milníky tohoto roku. Začátek nového roku patří dokončování prací na implementaci obchodního přístupu Relative Value. Jak dopadly implementační výsledky Vás budeme informovat v některém z následujících postů. Po dokončení tohoto úkolu si výzkumný tým udělá krátkou odbočku na pole trhů se zlatem. Výsledkem této 14 – ti denní zastávky by měl být profitabilní obchodní systém využívající momentum cenového pohybu futures kontraktů.

Po tomto intermezzu se tým ponoří do nové kapitoly optimalizace portfolia a tvorbě metastrategií. Optimalizace portfolia ruled-based strategií je pro tým velkou výzvou jak z technologického tak z obchodního hlediska. Budeme muset sloučit alfu-generující pravidla s o úroveň vyšší problematikou optimalizace (maximalizace zisku vs. minimalizace rizika, respekt k investičnímu horizontu vs. dostupnost dat a technická proveditelnost exekuce apod.). Současně s otevřením této kapitoly bude pokračovat tvorba nástrojů pro jednotlivé obchodní přístupy, které využívají hedgeové a podílové fondy (Fixed Income, Global Macro, Long/Short).

Kromě těchto úkolů čekají tým i další aktivity – na začátku dubna se budou někteří členové účastnit konference Asset Management 4.0 pořádanou Asociací pro komunikační nástroje a internet věcí. V rámci konference se budou členové týmu postupně vyjadřovat k tématům automatizace rozhodovacích a reportovacích procesů, tvorbě matematicko-statistických modelů, optimalizace a řízení rizika.

Další novinkou, která nás tento rok čeká je naše předsevzetí dát Vám zprávu o naši činnosti každý týden prostřednictvím alespoň jednoho postu. V rámci našich postů bychom Vás také rádi informovali o výsledcích naší činnosti, a to včetně výsledků implementace obchodních přístupů. Budeme se těšit na pravidelná setkávání a přejeme vše nejlepší do nového roku.

Michal Dufek[:en]At the beginning of the new year our team compiled an outlook of research activities for 2019, and in this post we would like to introduce the most important milestones. The start of the year is devoted to completing the implementation of the Relative Value trading approach. We will report on the results in one of the following posts. After completing this task, the research team will take a short detour into the gold markets. The result of this 14-day sprint should be a profitable trading system that uses the price momentum of futures contracts.

After this sprint, the team will dive into a new chapter: portfolio optimization and metastrategy development. Optimizing a portfolio of rule-based strategies is a major challenge for the team from both a technological and a trading point of view. We will have to merge alpha-generating rules with optimization issues one level higher (profit maximization vs. risk minimization, respect for the investment horizon vs. availability of data and technical feasibility of execution, etc.). While this chapter opens, the development of tools for the individual trading approaches used by hedge funds and mutual funds (Fixed Income, Global Macro, Long/Short) will continue.

In addition to these tasks, other activities await the team: at the beginning of April some members will take part in the Asset Management 4.0 conference organized by the Association for Communication Tools and the Internet of Things. At the conference, team members will speak in turn on the automation of decision-making and reporting processes, the creation of mathematical-statistical models, optimization and risk management.

Another novelty this year is our resolution to report on our activities every week in at least one post. We will also inform you about the results of our work, including the results of implementing the trading approaches.

We look forward to meeting you here regularly and wish you all the best for the new year.

Michal Dufek[:]

Suitable representatives for a set of reviews

As we mentioned in the previous post, our team is working on a project to help you decide when buying products and services. We try to help users form an objective view of the specific items they want to buy by analyzing reviews published by other users. We have now downloaded enough reviews and product articles in Czech and English to analyze individual opinions. In the first phase, the obtained texts had to be adapted to a form suitable for analysis.

The documents had to be divided into individual sentences, because users often present several ideas in one document and evaluate several criteria. The next step was to remove insignificant words that carry little or no information value, for example conjunctions, prepositions, web addresses and so on. In this step we also used our own POS analyzer, which assigns a part of speech to each word in a sentence, and our own stop-word dataset. Nouns, adjectives and verbs were of particular interest to us. We then reduced the words to their base forms by determining their roots.
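The first two preprocessing steps (sentence splitting and removal of uninformative tokens) can be sketched with the standard library alone. The stop-word list below is a tiny illustrative sample, not our dataset, and part-of-speech tagging and stemming are left out because they depend on our own tools.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the real dataset).
STOP_WORDS = {"the", "a", "an", "and", "but", "or", "is", "it",
              "in", "on", "of", "to", "for", "very"}


def split_sentences(document):
    """Split after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def clean_sentence(sentence):
    """Lowercase, drop web addresses and stop words, return the tokens."""
    sentence = re.sub(r"(https?://|www\.)\S+", " ", sentence.lower())
    tokens = re.findall(r"[a-z]+", sentence)
    return [t for t in tokens if t not in STOP_WORDS]


review = ("The battery is great. But the screen scratches easily! "
          "See www.example.com for photos.")
sentences = split_sentences(review)
cleaned = [clean_sentence(s) for s in sentences]
for tokens in cleaned:
    print(tokens)
```

Each sentence becomes its own short token list, so a review that praises the battery and criticizes the screen yields two separate units for later analysis.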

We transformed the edited documents into vector form using tf-idf and then split them into clusters with the same themes using the k-means method. We managed to identify reasonably distinct clusters with a high degree of internal cohesion. The identified topics related to the main parameters of the product segment surveyed.
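To make the two steps concrete, here is a small self-contained sketch of tf-idf vectorization followed by k-means (Lloyd's algorithm). It uses one common idf variant, ln(N/df) + 1, and a deterministic farthest-point initialization so that the result is reproducible; both choices, and the four toy documents, are our assumptions. A production pipeline would more likely rely on an established library.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns (vocabulary, L2-normalized vectors)."""
    vocab = sorted({t for doc in docs for t in doc})
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    idf = {t: math.log(n / df[t]) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = [counts[t] / len(doc) * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(
            vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                  for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


docs = [
    ["battery", "life", "long"],
    ["battery", "charge", "fast"],
    ["screen", "bright", "sharp"],
    ["screen", "scratch", "sharp"],
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)
```

The two battery sentences end up in one cluster and the two screen sentences in the other, which mirrors how clusters came to correspond to product parameters.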

The clusters created for the whole segment, based on expert articles, were then used to classify product reviews. From the clusters identified for individual reviews, we chose those with the highest informative value, and these are presented as suitable representatives for the given set of reviews. The result of the analysis is shown in the example below.

Jan Přichystal

Walk forward test of correlated pairs

In today’s post, we return to the Relative Value approach, for which we introduced a methodology for analyzing stocks, in particular pairs of stocks that show co-movement. We will use this co-movement to create a market-neutral trading strategy (in this case, pair trading).

In the last post we introduced the part of the application that searches the given stocks and selects those whose logarithmic price differences are mutually correlated. However, this result is static: it is valid only for the moment the application is run. The user will also be interested in the stability of these pairs (or of their correlation) over time. The application has therefore been extended with a walk forward test, which tracks how the correlated pairs develop over time.

Methodology

The walk forward test is non-anchored: both the beginning and the end of the test window move, i.e. the window is not extended. As in the previous case, pairs are matched by the linear correlation of logarithmic price differences.
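For illustration, the correlation of logarithmic price differences for one pair can be computed as in the sketch below. The price series are made up, and the Pearson coefficient is written out by hand to keep the example free of dependencies.

```python
import math


def log_differences(prices):
    """ln(p_t) - ln(p_{t-1}) for consecutive prices."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pair_correlation(prices_a, prices_b):
    """Correlation of the two stocks' logarithmic price differences."""
    return pearson(log_differences(prices_a), log_differences(prices_b))


stock_a = [100.0, 102.0, 101.0, 105.0, 107.0]
stock_b = [50.0, 51.0, 50.5, 52.5, 53.5]  # stock_a scaled by 0.5
stock_c = [80.0, 79.0, 81.0, 78.0, 77.0]  # moves against stock_a
corr_ab = pair_correlation(stock_a, stock_b)
corr_ac = pair_correlation(stock_a, stock_c)
print(corr_ab, corr_ac)
```

Working with log differences makes the measure independent of the price level: a series that is a constant multiple of another has identical log differences and therefore a correlation of 1.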

Computation

A walk-forward test was performed for each pair of in-sample / out-sample correlation thresholds:

  • Each walk forward test is a set of partial tests for different walk-window lengths (from 2 to 20 years, in 1-year steps).
  • Each walk-window is split into in-sample and out-sample sections of varying length (1 to 20 years, in 1-year steps).
  • The start of the in-sample section has been shifted from the beginning of the data by the length of the out-sample section (e.g. the 5-year test starts in the years 1997 to 2013, i.e. it runs to the end of 2018).
  • The “survival” of the pairs in each walk-window was monitored, i.e. the number and/or proportion of in-sample pairs that were also identified in the out-sample section.
  • For example, 80% means that 10 pairs were found in-sample and 8 of these pairs were also found out-of-sample.
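The rolling windows and the survival measure from the list above can be sketched as follows. The years are hypothetical, the one-year shift between consecutive windows is an assumption, and the start-offset detail from the third bullet is not modeled.

```python
def walk_windows(first_year, last_year, in_years, out_years):
    """Non-anchored (rolling) windows: both ends move, the length is fixed.

    Yields ((in_start, in_end), (out_start, out_end)) with inclusive years.
    """
    length = in_years + out_years
    start = first_year
    while start + length - 1 <= last_year:
        in_end = start + in_years - 1
        yield (start, in_end), (in_end + 1, in_end + out_years)
        start += 1  # assumed one-year shift between consecutive windows


def survival_rate(in_sample_pairs, out_sample_pairs):
    """Share of in-sample pairs that were found again out-of-sample."""
    if not in_sample_pairs:
        return None  # no in-sample pair, so nothing can survive
    survived = in_sample_pairs & out_sample_pairs
    return len(survived) / len(in_sample_pairs)


windows = list(walk_windows(2000, 2009, in_years=3, out_years=2))
print(windows[0], windows[-1], len(windows))

in_pairs = {f"P{i}" for i in range(10)}           # 10 pairs found in-sample
out_pairs = {f"P{i}" for i in range(8)} | {"NEW"}  # 8 of them found again
rate = survival_rate(in_pairs, out_pairs)
print(rate)
```

Pairs that appear only out-of-sample (here the placeholder `"NEW"`) do not affect the rate, since survival is measured relative to the in-sample list.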

Results

The heatmap summarizes the walk-forward test results for one pair of correlation thresholds (in-sample and out-of-sample data). Each stripe of numbers parallel to the main diagonal in the figure represents the results of a set of tests for a certain walk-window length (the sum of the X and Y values is the window length).

The individual boxes show the results for each combination of in-sample and out-sample window lengths. The horizontal axis (X) shows the in-sample window lengths, the vertical axis (Y) the out-sample window lengths. For example, the values for the 5-year window test are the points with coordinates (X, Y): (1, 4), (2, 3), (3, 2), (4, 1). Missing fields below the main (or secondary, see above) diagonal mean that no in-sample pair was found in the walk-window in any of the tests. The shorter the walk-window, the more times it can move across the full length of the data, i.e. the more partial tests take place, and the more robust the resulting aggregate value (average, median, etc.) is.
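The layout of the heatmap can be reproduced with two small helpers. The first lists the (X, Y) cells belonging to one walk-window length; the second counts how many times a window fits into the data, assuming a one-year shift and a hypothetical 20 years of data.

```python
def in_out_splits(window_years):
    """All (in_sample, out_sample) splits of a window as heatmap (X, Y) cells."""
    return [(x, window_years - x) for x in range(1, window_years)]


def partial_tests(data_years, window_years):
    """Number of window positions, assuming a one-year shift between tests."""
    return max(data_years - window_years + 1, 0)


print(in_out_splits(5))
print(partial_tests(20, 5), partial_tests(20, 20))
```

With these assumptions, a window as long as the whole data set can be placed only once, which is why aggregates for the longest windows rest on a single test.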

More favorable (desirable) results are darker, unfavorable ones lighter, and each metric has its own “color” (average: blue, median: green, test properties: red, range: purple).

Mean

In the first image, in the area of the narrowing line of “1” values near the main diagonal, the survival of pairs is extreme in both directions: negative (0 or crossed out) as well as positive (100%). This area mirrors the narrowing line of “1” values in the number of tests performed (the first red chart), i.e. the tests performed only once (because of long out-sample / sliding windows). The “XOM-CVX” pair is almost always among the results, so it is the strongest pair across the different periods.

Median

The figure below shows results similar to the previous case, except that the median survival rate (the middle value of the sorted rates) is calculated instead of the average survival rate. In other words, the average is replaced by the median.

Conclusion

The output of this application function is a test of robustness or, better, of the stability of pairs over time. Take an example in which a user of the application exports, as of today, a list of pairs suitable for pair trading.

The user then wonders how the list would change if the export had been done yesterday or a year ago. The answer is precisely the walk forward test, which tracks and evaluates changes in the list of pairs over time. Using this feature, the user will recognize that XOM-CVX (Exxon Mobil Corporation and Chevron Corporation) is one of the pairs with the strongest interrelationship among the analyzed assets.

Michal Dufek