AI Coding Standards

At Sailing Byte, we pay close attention to the tools we use. We are aware of technological advancements, but we always prioritise customer data security and business considerations. Only this approach aligns with our philosophy of partnership in projects. With the increasing adoption of AI in the market, we have decided to establish consistent guidelines for its use at Sailing Byte.

Why is this important?

Because a lack of copyright means a lack of ownership of the product, which poses a real business risk.

Imagine your code leaks and is cloned thousands of times, yet you are legally powerless because it turns out the code does not belong to you. Anthropic recently found this out the hard way – 512,000 lines of code leaked from their company, which they could not protect under copyright law.

What will you do when someone asks: “How much of the delivered code was written by AI, and can you prove that we have the rights to it?” You buy or use code created by a software house on the assumption that you become its legal owner. However, if the software house does not hold the copyright to the generated code from the outset, it cannot transfer ownership of it to you.

Of course, the solution is not to block AI tools within the team. The key is to implement standards for the use of AI that guarantee the appropriate processing of data and code by the AI. Appropriate means standards that actually grant you the economic copyright to the generated code, and, in terms of the data processed, do not infringe on privacy laws or the scope of data processing.

Therefore, to protect our clients whilst still being able to harness the power of AI, at Sailing Byte we have implemented the following principles for the use of artificial intelligence – to the benefit of both us and our clients.

AI-supported coding standards

  1. We only propose AI to our clients where it makes strategic or business sense or provides real value to the user. If using AI in your project doesn’t make sense, we’ll let you know.
  2. We use only AI tools where there is no doubt that we retain control over both the “input” fed into the models and the “output” they produce. Only this approach guarantees that we can safely transfer all economic copyrights to the entire code to our clients in accordance with the terms of the contract.
  3. We use only AI tools that do not learn from the code and data provided. Only this approach ensures full ownership of the code and limits its potential use by third parties.
  4. We do not transmit any user data to AI during the development process. However, this should be distinguished from the transmission of user data to AI in functionalities implemented on production systems.
  5. We use only providers that comply with GDPR standards. Because EU standards are stricter for our use cases, meeting them also covers data privacy requirements in the US.
  6. We—Sailing Byte—are responsible for errors in the code generated by AI, because we are the experts, and the duty of care in this regard rests with us. We will not shift the responsibility for the results of working with the AI tools we have chosen onto you, the client.
  7. The use of AI tools for programming is already factored into all our quotes and proposals. We know which tools can help bring your project to life, and how.
  8. Where code privacy or the use of a specific local model is critical, we can deploy a dedicated private model exclusively for you. However, in such cases, you will cover the cost of maintaining this tool, and it will be added to your invoice.
  9. Where the model’s behaviour needs to be specifically tailored, we can train a model for you. For this, we’ll need a significant amount of data, but this approach significantly improves the model’s performance. If you’re interested in this solution, let us know.
  10. If you, as our client, choose the AI tools we will work with yourself, the points above may need to be revised and separate arrangements may be required. In that case, some or all of the points above may not apply, and you, as our client, are responsible for the selected AI tool.
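
As an illustration only, principles 2, 3 and 5 can be read as a checklist applied to each candidate tool. The sketch below is a minimal Python example under stated assumptions: the ProviderProfile record, its field names and the failed_principles helper are hypothetical and do not describe any real provider or any internal Sailing Byte tooling.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderProfile:
    """Hypothetical summary of what a provider's terms say (assumed fields)."""
    name: str
    controls_input_and_output: bool  # principle 2: we keep control of input and output
    trains_on_customer_data: bool    # principle 3: the tool must NOT learn from our code or data
    gdpr_compliant: bool             # principle 5: the provider complies with GDPR standards


def failed_principles(provider: ProviderProfile) -> list[int]:
    """Return the numbers of the principles (2, 3, 5) that the provider fails."""
    failed = []
    if not provider.controls_input_and_output:
        failed.append(2)
    if provider.trains_on_customer_data:
        failed.append(3)
    if not provider.gdpr_compliant:
        failed.append(5)
    return failed


if __name__ == "__main__":
    candidate = ProviderProfile(
        name="example-provider",  # placeholder name, not a real vendor
        controls_input_and_output=True,
        trains_on_customer_data=True,
        gdpr_compliant=True,
    )
    print(failed_principles(candidate))
```

An empty result means the tool passes these three checks; the remaining principles concern responsibility, pricing and project fit, and are not modelled here.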

Additionally, we have prepared a comparison of AI providers and their compliance, to help both you and us see the differences between them.

Data processing by AI

Level 0 – Non-personal data (not subject to the GDPR). Providers only need to meet the requirements above, with a preference for zero data retention (ZDR). No additional consent is required to use AI. Example data:

  • Source code, system architecture
  • Technical documents, logs without identifiers
  • Company financial data (e.g., revenue, costs at the company level)
  • Public company data (KRS, NIP, registered office address)
  • B2B contracts where the party is a company, not a natural person
  • Internal strategies, presentations without personal data

Level 1 – “Ordinary” personal data (Art. 6 GDPR). We do not send this data to AI by default. We process it only with the Client’s explicit consent and at their request (for example, as part of a system). This requires a DPA (data processing agreement) with every processor (gateways, load balancers, model providers). The location of processing matters. Recommended options include OpenRouter Enterprise with EU Lock, Cortecs, or Eden AI.

Criterion: data that identifies or could identify a natural person. Examples of such data:

  • Employee’s first and last name, work email
  • Customer contact information (natural persons, B2C)
  • User IP logs
  • Content of emails and support tickets containing customer data
  • Invoices issued to natural persons (B2C)
  • CVs, employee HR data

Level 2 – Sensitive data, “special categories” (Art. 9 GDPR). We do not transmit this data to AI by default. We process it only with the Client’s explicit consent and at their request, and exclusively in justified cases (for example, as part of a system). This requires additional arrangements regarding the provider and supplementary documentation. As a general rule, processing is prohibited without an explicit legal basis. An example of a recommended stack is LiteLLM + a provider with hosting strictly within the EU (e.g., AWS Bedrock In Region, OVH, Azure Data Zone). A DPIA (Data Protection Impact Assessment) is required. Examples of such data:

  • Health status, illnesses, test results
  • PESEL, ID card number
  • Biometric data (faces, fingerprints)
  • Political, religious, or union affiliation
  • Criminal record data
  • Genetic data
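
The three levels above can be read as a simple gate: the higher the level, the more safeguards must be in place before any data reaches a model. The Python sketch below illustrates that reading only. The DataLevel and Safeguards types and the missing_safeguards function are hypothetical names. It also makes two assumptions that the text does not state outright: Level 2 inherits the Level 1 requirements, and EU-only hosting (given above as a recommended example) is treated as required for Level 2.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class DataLevel(IntEnum):
    NON_PERSONAL = 0  # Level 0: not subject to the GDPR
    PERSONAL = 1      # Level 1: "ordinary" personal data (Art. 6 GDPR)
    SPECIAL = 2       # Level 2: special categories (Art. 9 GDPR)


@dataclass(frozen=True)
class Safeguards:
    """Hypothetical record of what has been arranged for one use of AI."""
    client_consent: bool = False           # explicit consent and request from the Client
    dpa_with_all_processors: bool = False  # DPA signed with every processor in the chain
    eu_only_hosting: bool = False          # assumption: treated as required for Level 2
    dpia_done: bool = False                # Data Protection Impact Assessment completed


def missing_safeguards(level: DataLevel, s: Safeguards) -> list[str]:
    """List what is still missing before data of this level may be processed."""
    missing = []
    if level >= DataLevel.PERSONAL:
        if not s.client_consent:
            missing.append("explicit client consent")
        if not s.dpa_with_all_processors:
            missing.append("DPA with every processor")
    if level >= DataLevel.SPECIAL:  # assumption: Level 2 adds to Level 1
        if not s.eu_only_hosting:
            missing.append("hosting strictly within the EU")
        if not s.dpia_done:
            missing.append("DPIA")
    return missing


if __name__ == "__main__":
    arranged = Safeguards(client_consent=True, dpa_with_all_processors=True)
    print(missing_safeguards(DataLevel.SPECIAL, arranged))
```

An empty list means nothing further is needed under this simplified model. Deciding which level a given piece of data belongs to still requires human judgement and is deliberately left out of the sketch.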

This document is open to extension

If you believe that your business case does not fit the guidelines described above, this does not mean that we cannot work together. Contact us, and we will discuss your case in detail.