AIMultiple Research

AIMultiple AI Writer Benchmark Methodology in 2024

Updated on Feb 14
Written by
Cem Dilmegani

Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per Similarweb), including 60% of the Fortune 500, every month.

Cem's work has been cited by leading global publications including Business Insider, Forbes and the Washington Post, global firms like Deloitte and HPE, NGOs like the World Economic Forum, and supranational organizations like the European Commission. You can see more reputable companies and media that referenced AIMultiple.


AIMultiple aims to help buyers identify the right writing assistant for their business.

AIMultiple’s first AI writer benchmark will aim to help marketing teams choose the writing assistant that best fits their business needs. The benchmark will assess these aspects:

  • For the resulting articles:
    • Comprehensiveness
    • Readability
    • Truthfulness
    • Correct use of English and grammar
    • Je ne sais quoi (i.e. how attractive / engaging the article is)
  • Customer service
  • Total cost of ownership

What will be the guiding principles?

AIMultiple’s benchmark methodology is designed for an objective and transparent assessment. It also explains participation requirements.

What will be benchmarked?

AIMultiple will submit prompts through the UI provided by each AI writing assistant and evaluate the resulting articles.

What is the benchmark dataset?

The AIMultiple team will create 50 prompts: 25 B2C-focused and 25 B2B-focused. They will cover a mix of top-of-the-funnel, middle-of-the-funnel and bottom-of-the-funnel articles.
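As an illustration only, the dataset composition described above could be checked with a short script. This is a minimal sketch, not AIMultiple's tooling: the `Prompt` structure, its field names and the `validate_dataset` helper are hypothetical, and requiring every funnel stage to appear at least once is an assumption, since the methodology only says the prompts will be "a mix".

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Prompt:
    """Hypothetical record for one benchmark prompt."""
    text: str
    audience: str      # "B2B" or "B2C"
    funnel_stage: str  # "top", "middle" or "bottom"


def validate_dataset(prompts):
    """Return True if the prompts match the 50-prompt, 25/25 audience split.

    Assumption: each funnel stage must appear at least once.
    """
    audiences = Counter(p.audience for p in prompts)
    stages = {p.funnel_stage for p in prompts}
    return (
        len(prompts) == 50
        and audiences["B2B"] == 25
        and audiences["B2C"] == 25
        and stages == {"top", "middle", "bottom"}
    )
```

A check like this would flag a prompt set that drifts from the stated split before any vendor is evaluated.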

What is required from the AI writing assistant?

The complete article needs to be returned within 5 minutes of receiving the prompt.
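The time requirement amounts to a simple comparison of two timestamps. The sketch below assumes the evaluator records when the prompt was submitted and when the full article appeared. The function name is hypothetical, and treating exactly 5 minutes as a pass is an assumption.

```python
from datetime import datetime, timedelta

# Limit stated in the methodology.
TIME_LIMIT = timedelta(minutes=5)


def within_time_limit(prompt_sent_at: datetime, article_received_at: datetime) -> bool:
    """Return True if the article arrived no later than 5 minutes after the prompt."""
    return article_received_at - prompt_sent_at <= TIME_LIMIT
```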

How will AIMultiple perform the benchmark?

AIMultiple’s AI writing assistant benchmark aims to closely match the preferences of buyers. Buyers want a solution that produces articles as close to publishable quality as possible. Therefore, AIMultiple will measure these metrics:

  • For the resulting articles, industry analysts from AIMultiple’s team who have extensive online writing experience will score each article on a 10-point scale. Each evaluator must have produced online articles that receive thousands of visitors per month on competitive topics. Results will be the average of 5 evaluators’ assessments in these dimensions:
    • Comprehensiveness
    • Readability
    • Truthfulness
    • Je ne sais quoi (i.e. how attractive / engaging the article is)
  • Correct use of English and grammar will be measured for each vendor by counting the number of mistakes. AIMultiple will share the number of grammar mistakes per 1,000 words for each solution.
  • Customer service: Reviews on B2B review platforms will be analyzed to assess customer satisfaction.
  • Speed: If there are significant differences in speed between the vendors, this will be highlighted.
  • Other features
  • Total cost of ownership: Public cost data published by the vendors will be used to calculate the cost of running the benchmark. Each vendor’s cost model will also be shared to help buyers compare prices across vendors.
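The two quantitative article metrics above, the evaluator average and the grammar mistake rate, reduce to simple arithmetic. The following is a minimal sketch under stated assumptions rather than AIMultiple's actual scoring code: the dimension keys, function names and the check for exactly five evaluators are hypothetical choices made for illustration.

```python
from statistics import mean

# Dimensions scored by the evaluators; the key names are illustrative.
DIMENSIONS = ("comprehensiveness", "readability", "truthfulness", "je_ne_sais_quoi")


def average_scores(evaluations, expected_evaluators=5):
    """Average each dimension across the evaluators' 10-point scores.

    `evaluations` is a list with one dict per evaluator, keyed by dimension.
    """
    if len(evaluations) != expected_evaluators:
        raise ValueError(f"expected {expected_evaluators} evaluations")
    return {d: mean(e[d] for e in evaluations) for d in DIMENSIONS}


def mistakes_per_1000_words(mistake_count, word_count):
    """Normalize a raw grammar mistake count to a per-1,000-words rate."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return mistake_count * 1000 / word_count
```

For example, an article set with 3 mistakes across 1,500 words would yield a rate of 2 mistakes per 1,000 words, which makes vendors that produce longer or shorter articles comparable.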

How will the results be published?

They will be published on AIMultiple.com and will feature graphs that users can leverage to find the right vendor for their business. Different metrics (e.g. manual effort) will be separately presented to create transparency for buyers.

Each participant will receive their detailed results as well as the average results.

Challenges

Writers would normally use the AI assistant output as a starting point, not as the final product. This benchmark aims to measure the quality of this initial product. It would also be interesting to know how the AI assistant supports the writing process. However, measuring writers’ preferences during their writing process would introduce more subjectivity, and therefore we will not consider that in this assessment.

Please note that AIMultiple is in the design phase of the benchmark and changes will be made as AIMultiple gets end user feedback and finalizes the benchmark.

Reach out to the AIMultiple team via info@aimultiple.com if you would like to participate in the AIMultiple AI writer benchmark.

Cem Dilmegani
Principal Analyst

Throughout his career, Cem served as a tech consultant, tech buyer and tech entrepreneur. He advised enterprises on their technology decisions at McKinsey & Company and Altman Solon for more than a decade. He also published a McKinsey report on digitalization.

He led the technology strategy and procurement of a telco while reporting to the CEO. He also led the commercial growth of deep tech company Hypatos, which went from zero to a 7-digit annual recurring revenue and a 9-digit valuation within 2 years. Cem's work at Hypatos was covered by leading technology publications like TechCrunch and Business Insider.

Cem regularly speaks at international technology conferences. He graduated from Bogazici University as a computer engineer and holds an MBA from Columbia Business School.

Sources:

AIMultiple.com Traffic Analytics, Ranking & Audience, Similarweb.
Why Microsoft, IBM, and Google Are Ramping up Efforts on AI Ethics, Business Insider.
Microsoft invests $1 billion in OpenAI to pursue artificial intelligence that’s smarter than we are, Washington Post.
Data management barriers to AI success, Deloitte.
Empowering AI Leadership: AI C-Suite Toolkit, World Economic Forum.
Science, Research and Innovation Performance of the EU, European Commission.
Public-sector digitization: The trillion-dollar challenge, McKinsey & Company.
Hypatos gets $11.8M for a deep learning approach to document processing, TechCrunch.
We got an exclusive look at the pitch deck AI startup Hypatos used to raise $11 million, Business Insider.
